ABSTRACT


To date, in spite of much speculation, no computer-aided testing
techniques for software have been evaluated in a controlled testing
environment. This report presents and discusses the results of a series
of such tests.


The  techniques  evaluated  are  path  (branch)  coverage  testing  and 
static  analysis.  The  basic  approach  was  to  prepare  programs  for  testing 
by  seeding  them  with  errors  whose  type  and  frequency  are  typical  of  new 
software  at  the  integration-  or  system-level  of  testing. 


The experiments were conducted in three phases. The first used
eight small programs from a popular programming manual; the second and
third used a 5000-line FORTRAN program that simulates ballistic-missile
defense engagements. For the most part, both the path testing and static
analysis used the SQLAB tool, with the techniques used singly and in
combination. In Phase 1, the DAVE system's static analysis capabilities
were also used. In Phase 3, the techniques were compared with the
techniques of intermediate-value printout and control-flow tracing.

Of the two techniques, path testing was more effective overall. Its
lack of localized error messages was a drawback, but the enhancement to
the inspection process was significant, doubling the usual inspection
yield. Static analysis, while not as powerful, at times detected errors
path testing did not find. It is economical, and its diagnostic message
at the error's statement location is a distinct advantage.


The inescapable conclusion remains, however, that fully automated
computer-aided testing is not possible at present. Further, the errors
that are not detected are generally considered difficult to locate by
conventional techniques. In particular, the missing ingredient seems to
be a specification of the legal path sequences which a program should
be allowed to travel. The error-seeding process is recommended as a
measure of testing thoroughness.
INTRODUCTION


This  report  describes  the  procedures  and  results  of  a series  of 
controlled  experiments  designed  to  gather  data  on  actual  test  tool  usage. 
The  primary  goal  of  these  experiments  was  to  evaluate  and  compare  two 
automated  testing  techniques,  path  (branch)  coverage  testing  and  static 
analysis,  by  determining  the  types  of  errors  each  is  capable  of  locating 
and  measuring  the  computer  and  engineering  time  the  techniques  require 
to  detect  each  type  of  error. 

An  additional  goal  of  the  experiment  was  to  observe  and  compare  the 
relative  testing  effectiveness  in  a multi-error  environment  of  a test  tool 
capable  of  both  path  testing  and  static  analysis  and  a sophisticated 
compiler  having  automated  intermediate  value  printout  and  execution 
tracing  features. 

The experiments were successful in providing data on error detection
rates and the level of effort required for finding specific types of
errors. They also provided a background for analyzing parallel testing
strategies in which the human element, as well as the testing tool
technique, plays a significant role in the software testing effort. One of
the most important byproducts of the error-seeding activity was to
indicate the acute vulnerability of software, especially to errors
which can mask each other or which never appear for any but the most
exhaustive test data.

1.1  BACKGROUND 

Histories of several large software development projects have shown
that roughly half the cost of bringing such a project to operational
capacity is incurred in "testing" the software after the developer (or
the schedule) had declared the product completed.^1,2 In general, this
type of testing is intended to demonstrate that the software is ready for
operational use; in fact, a large portion of such testing is devoted to
detecting and correcting errors that have gone undetected during
development. To assist in this difficult process of testing, various
computer-aided techniques have been devised and the necessary supporting
tools developed. The value of such computer-aided testing techniques has
been both challenged and supported extensively.^3 In the few published
studies on the subject that reported the use of test tools, there is
disagreement on their effectiveness. In none of the studies on medium-
or large-scale software, however, have the evaluations been made in a
controlled testing environment in which automated tools were actually
used. The goal of this project was to run a series of controlled
experiments to gather data based on actual test tool usage.

Goodenough^4 states that 40-92 percent of errors could be found
using path-testing techniques. He stresses that the limitations of path
testing have not been adequately described and that a false sense of
confidence of program correctness may develop if only path-testing methods
are used. However, Goodenough's view of path testing excludes the
functionality of the data, thereby limiting the testing process to
structural path execution. We stress that path testing is not intended
to be performed without respect paid to the "reasonableness" of the input
data.

^1 B. W. Boehm, "Software Engineering: R & D Trends and Defense Needs,"
   Proceedings of the Conference on Research Directions in Software
   Technology, October 1977.

^2 D. S. Alberts, "Economics of Software Quality Assurance," AFIPS
   Conference Proceedings, Vol. 45, National Computer Conference, 1976.

^3 D. J. Reifer and R. L. Ettenger, "Test Tools: Are They a Cure-All?"
   Proceedings of the 1975 Annual Reliability and Maintainability
   Symposium, IEEE 75CH0918-3RQC, January 1975.

^4 J. B. Goodenough, "A Survey of Program Testing Issues," Proceedings
   of the Conference on Research Directions in Software Technology,
   October 1977.
The few studies that report quantitative results for analyzing the
effectiveness of path testing are in disagreement. Hetzel^1 states that
path testing is of "little value" in the detection of errors. Gannon^2
states that systematic functional and structural testing using a
well-defined test plan and a path-testing tool produced an error rate of
0.3% after acceptance test for a large JOVIAL program. The disagreement
over the fraction of errors found by path testing is further shown in
Table 1.1.

While Mangold states that 92% of the errors in a program might be
found, Howden and Goodenough state that perhaps 50% might be found.
The word "might" is used because, except for Gannon's work, no
path-testing tool was used to obtain the quoted figures. This lack of
empirical results has led to the widely divergent opinions on the value
of path testing.

TABLE 1.1

THEORETICAL RESULTS OF PATH TESTING

Total      Path (branch)
Errors     Testing Detects     % Detected     Source

  22           7-14              40-65        Howden*
  28           6                 21           Howden†
 224           206               92           Mangold§
   7           7                 50           Goodenough‡

* W. E. Howden, "Symbolic Testing and the DISSECT Symbolic Evaluation
  System," Computer Science Technical Report 11, University of California,
  San Diego, May 1976.

† W. E. Howden, "Theoretical and Empirical Studies in Program Testing,"
  IEEE Transactions on Software Engineering, Vol. SE-4, No. 4, July 1978.

§ E. R. Mangold, "Software Error Analysis and Software Policy
  Implications," IEEE EASCON, 1974, pp. 123-127.

‡ Goodenough, op. cit.

^1 W. C. Hetzel, An Experimental Analysis of Program Verification Methods,
   Thesis, University of North Carolina, Chapel Hill, N. C., 1976.

^2 C. Gannon, "A Verification Case Study," Proceedings of AIAA Computers
   in Aerospace Conference, Los Angeles, November 1977.
Howden's^1 results are based on the analysis of errors in very small
programs (fewer than 30 statements). These programs, taken from Kernighan
and Plauger,^2 contain examples of common programming blunders and provide
a common basis for comparison. Howden, however, did not use a test tool
for his analysis. Hence, for the first phase of our testing experiments
we subjected these programs to actual path testing and static analysis.
A few of these programs were written in PL/I and had to be translated
into FORTRAN so that test tools could be used.

Very early in the experiments, we found that "error" is a very
ambiguous concept. In any software system, designers and programmers
take certain liberties based on the generality of the program, the
programming language and operating system used, and the requirements for
meeting size and speed limitations. In an environment that tries to
enforce very strict coding standards, ambiguous comments and intentional
mixed mode might be called errors. For our purpose, we defined an error
as any construct that (1) appeared to violate the program's specification,
or (2) relied on nonstandard characteristics of a compiler, operating
system, or computer.

1.2  PURPOSE  OF  EXPERIMENTS 

Two software testing techniques, static analysis and dynamic path
(branch) testing,^3 are currently receiving a great deal of attention in
the world of software engineering. However, empirical evidence of their
ability to detect errors is very limited, as are data concerning the
resource investment their use requires. Researchers have estimated or
intuitively graded these testing methods, as well as such other techniques
as interface consistency, symbolic testing, and special values testing.

^1 Howden, 1976, op. cit.

^2 B. W. Kernighan and P. J. Plauger, The Elements of Programming Style,
   McGraw-Hill, 1974.

^3 R. E. Fairley, "Tutorial: Static Analysis and Dynamic Testing of
   Computer Software," Computer, April 1978.
This project seeks (1) to demonstrate empirically the types of errors
one can expect to uncover, (2) to measure the engineering and computer
time which may be required by the two testing techniques for each class
of errors, (3) to analyze the relative merits of a test tool containing
both testing capabilities and a compiler containing automated
intermediate-value and trace capabilities, and (4) to direct attention
to near-term tool enhancements, based on the experimental evidence.

The experiments for this project were conducted in three phases.
The first phase examined the small programs from Kernighan and Plauger
using the static analysis and path testing capabilities of SQLAB^1
separately and the static analysis capabilities of the DAVE system.^2
These experiments were performed as a preliminary analysis of the two
testing techniques. The second phase of experiments was conducted to
determine the types of errors that static analysis and path testing are
capable of detecting during system-level testing. The experiments
involved seeding one error at a time into a medium-sized program and then
recording the detection rate and the resources required by each error
detection method. The third phase of experiments was designed to
evaluate the effectiveness of static analysis and path testing in a
multi-error environment. In this phase the two testing techniques were
compared with the classical techniques of intermediate-value printout and
execution tracing, automated by a sophisticated compiler. The complete
set of experiments is summarized in Table 1.2.


^1 S. M. Andrews and J. P. Benson, Software Quality Laboratory User's
   Manual, General Research Corporation CR-4-770, May 1978.

^2 L. D. Fosdick and C. Miesse, The DAVE System User's Manual,
   University of Colorado, CU-CS-106-77, March 1977.
TABLE 1.2

SET OF EXPERIMENTS

Phase  Purpose                      Test Object            Test Technique (Tool)

  1    Preliminary analysis:        Eight small programs   Path testing (SQLAB)
       comparison of empirical      from The Elements of   Static analysis (SQLAB)
       results with published       Programming Style      Static analysis (DAVE)
       theoretical results

  2    Determination of types of    5000-line trajectory   Path testing (SQLAB)
       errors which can be found    analysis FORTRAN       Static analysis (SQLAB)
       (single-error experiment)    program

  3    Evaluation of a test tool    5000-line trajectory   Path testing and
       in a multi-error             analysis FORTRAN       static analysis (SQLAB);
       experiment                   program                debugging/trace
                                                           compiler (CDC FTNX)


1.2.1  Description  of  Path  Testing 

Path testing is based upon the assumption that executing all the
paths in a program is sufficient to reveal a large fraction of its
errors. Stated another way, paths which have never been tested may
harbor errors. The only practical way to check systematically that each
path has been executed is to use an automated path-testing tool.


The first step in path testing is to develop a graph model of the
program, using the tool to identify all the paths through it.
This graph model is composed of an input node which represents all entry
points to the program, an output node which represents all possible
termination or exit points from the program, and a set of nodes which
represent all the possible branching points in the program. The nodes
are connected by links which represent the statements that are executed
sequentially between any two branch points. Note that this model assumes
that the destination of every branch in the program can be determined
statically. That is, dynamic definition of branch destinations (as in
FORTRAN assigned GOTO statements when the statement label list is not
included) is not allowed by this model.
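
As a rough illustration of the graph model described above, the following
Python sketch represents a small hypothetical program as an entry node,
two branch-point nodes, and an exit node joined by links (segments). The
node names, the `links` table, and the `enumerate_paths` helper are
assumptions made for this example only; they are not taken from SQLAB.

```python
# Hypothetical program graph: nodes are the entry point, the branch points,
# and the exit point; each link is a straight-line segment between two nodes.
links = {
    "s1": ("ENTRY", "B1"),
    "s2": ("B1", "B2"),    # first decision, one outcome
    "s3": ("B1", "B2"),    # first decision, other outcome
    "s4": ("B2", "EXIT"),  # second decision, one outcome
    "s5": ("B2", "EXIT"),  # second decision, other outcome
}


def enumerate_paths(links, node="ENTRY", prefix=()):
    """Yield every ENTRY-to-EXIT path as a tuple of link names.

    Works only for loop-free graphs; a graph with loops has no finite
    list of paths, which is one reason the all-paths criterion is relaxed.
    """
    if node == "EXIT":
        yield prefix
        return
    for name, (src, dst) in sorted(links.items()):
        if src == node:
            yield from enumerate_paths(links, dst, prefix + (name,))


paths = list(enumerate_paths(links))
for path in paths:
    print(" -> ".join(path))
```

Because every branch destination is written into the table, the graph can
be built without running the program, which is the static-destination
assumption the model depends on.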


In general, it is impractical and unnecessary to test all possible
paths through a program. The number of paths increases drastically with
the number of branches and loops the program contains. For this reason,
the criterion of testing all paths through the program is relaxed and
replaced by the requirement to exercise all of the links (or segments) in
the program graph. These links correspond to the straight-line code
executed between branch points and are called "segments" or
"decision-to-decision paths" (DD-paths). Note that by relaxing the
testing-all-paths criterion to the testing-all-segments criterion, we
implicitly assume the sequential independence of segments. However,
experience has shown that the order of segments is important, which
emphasizes one aspect of the path-testing methodology: input data must
reflect the functional requirements in order to execute the paths in
their intended order.
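
A small worked count shows why the all-paths criterion is relaxed. The
sketch below assumes, purely for illustration, a loop-free program made of
k two-way decisions executed one after another; the function name and the
counting model (two segments per decision) are assumptions of this
example, not figures from the experiments.

```python
def path_and_segment_counts(k):
    """Return (paths, segments) for k two-way decisions in sequence.

    Each decision doubles the number of distinct paths but adds only
    two segments (one per outcome) to the program graph.
    """
    return 2 ** k, 2 * k


for k in (5, 10, 20):
    paths, segments = path_and_segment_counts(k)
    print(f"{k:2d} decisions: {paths:>9,d} paths, {segments:3d} segments")
```

With 20 decisions there are over a million paths but only 40 segments, so
segment coverage is attainable where path coverage is not. The price is
the one noted above: covering each segment says nothing about the order
in which segments are combined.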

Path segment testing (known in this report as "path testing," and
having the same meaning as "branch" or "segment" testing) is usually
accomplished in the following manner. A set of test data that results
in correct execution of the program is taken as the basic test case.
Using this test case, the program is executed and measurements are taken
of the number of path segments executed by the basic test data. The data
values in the basic test data which have an effect upon the decision
(branch) points in the program are then altered so that every path
segment is exercised by the set of test data developed in this manner;
the program output is examined for errors, and any execution-time errors
are recorded. This process is extremely dependent upon the ability of
the tester, aided by the test tool, to derive data input values which
result in all path segments being executed.
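
The procedure above can be sketched in a few lines of Python. The routine
`smaller_magnitude`, the segment labels S1 to S4, and the hand-written
instrumentation are all hypothetical stand-ins for what a path-testing
tool would insert automatically; the sketch only shows the bookkeeping,
not how a tool such as SQLAB actually instruments FORTRAN.

```python
executed = set()  # segment labels recorded by the hand-written probes
ALL_SEGMENTS = {"S1", "S2", "S3", "S4"}


def smaller_magnitude(a, b):
    """Toy test object with two decisions, hence four segments."""
    if a < 0:
        executed.add("S1")
        a = -a
    else:
        executed.add("S2")
    if a < b:
        executed.add("S3")
        result = a
    else:
        executed.add("S4")
        result = b
    return result


def coverage(test_data):
    """Run every test case and return the fraction of segments exercised."""
    executed.clear()
    for a, b in test_data:
        smaller_magnitude(a, b)
    return len(executed) / len(ALL_SEGMENTS)


basic = [(3, 5)]               # basic test case: reaches S2 and S3 only
expanded = basic + [(-7, 2)]   # altered data values reach S1 and S4

print(coverage(basic), coverage(expanded))
```

The second test case was chosen by reading the two decisions and picking
values that reverse both outcomes, which is the tester-dependent step the
paragraph describes. Full coverage here still requires the outputs to be
inspected; the coverage figure alone reports no errors.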
1.2.2  Description  of  Static  Testing

Although, in its current state of development, static analysis is
not able to demonstrate the functional correctness of a program, it is
easy to use and can detect a number of program errors. The static
analysis capabilities of the testing tool are:

1.  Set/use checking - warning of local variable usage without
    prior setting or local variable setting with no subsequent
    usage.

2.  Call checking - the number and type of actual parameters for
    each invocation are checked against the number and type of
    formal parameters.

3.  Mode checking - the left and right side of assignment
    statements are analyzed for type consistency.

4.  Graph checking - the control flow graph is analyzed for
    structurally unreachable code and loops in which the control
    variable is not changed.
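
To make the first of these checks concrete, here is a minimal Python
sketch of set/use checking over straight-line code. It is an illustration
under stated assumptions, not SQLAB's algorithm: the statement list is
invented, each statement is reduced to a (target, operands) pair, and
control flow is ignored, whereas a real analyzer works over the program
graph.

```python
def set_use_check(statements, inputs=(), outputs=()):
    """Return (used_but_not_set, set_but_never_used) for straight-line code.

    statements: list of (target, operands) pairs in execution order.
    inputs:     names set outside the routine (e.g. formal parameters).
    outputs:    names whose final value is used outside the routine.
    """
    defined = set(inputs)
    used = set()
    used_but_not_set = []
    for target, operands in statements:
        for name in operands:
            used.add(name)
            if name not in defined and name not in used_but_not_set:
                used_but_not_set.append(name)
        defined.add(target)
    targets = {target for target, _ in statements}
    set_but_never_used = sorted(targets - used - set(outputs))
    return used_but_not_set, set_but_never_used


# Hypothetical routine: MAXNUM is read but never assigned,
# and IFLAG is assigned but never read.
statements = [
    ("I", ["NUM"]),
    ("J", ["I", "MAXNUM"]),
    ("IFLAG", ["J"]),
    ("ARRAY", ["J"]),
]
errors, warnings = set_use_check(statements, inputs=["NUM"], outputs=["ARRAY"])
print("used but never set:", errors)
print("set but never used:", warnings)
```

Both findings concern a variable rather than a single statement, which is
consistent with set/use messages being reported separately from the
source line.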

Even small programs can contain errors not easily visible in the
source listing. Figures 1.1 and 1.2 show a sample program listing and
static analysis report generated by SQLAB. Except for set/use checking,
the error and warning messages appear at the appropriate source
statement. Error location definition is an advantage which path testing
does not have.

1.3  MAJOR  CONCLUSIONS 

The set of experiments provided evidence for assessing the
effectiveness of separately using two automated testing techniques for
detecting errors of the following categories: computational, logic,
input/output, data handling, interface, data definition, and database.
Also provided by the experiments were the amounts of engineering and
computer time expended. Table 1.3 summarizes the rate of error detection
and resources. Detailed results, including detection rates for each type
of error within each category, are provided in Sec. 5. The computer
program used as a test object for most of the experiments is described in
Sec. 3, and each error type and frequency used in the experiments is
described in Sec. 4.

[Figure 1.2: scanned report listing, only partly legible. Recoverable
content of the static analysis report for SUBROUTINE BSORT (NUM, ARRAY):

  Call error:      ERROR called with 1 argument, actually has 2 arguments.
  Graph warning:   statement 39 is unreachable or is in an infinite loop.
  Mode warning:    left hand side has mode INTEGER, right hand side has
                   mode LOGICAL.
  Set/use warning: local variable IFLAG set but never used.
  Set/use error:   local variable MAXNUM used but never set.
  Nonlocal variables set: ARRAY.]

Figure 1.2. Sample Static Analysis Report from SQLAB.

As Table 1.3 indicates, and Sec. 5 describes more fully, path
testing is more effective than static analysis at detecting and locating
computational, logic, and database errors. Even so, the rate of
detection and amount of engineering time required by path testing show it
is not sufficient for use as the sole program verification or error
detection technique, and it is rather time-consuming. Static analysis
requires much less engineering and computer time (per error), but the
payoff in finding errors of a system-level nature is not as great.


TABLE 1.3

SUMMARY OF ERROR CATEGORY DETECTION

                    Detection Rate (%)        Resources E/C*

                    Static      Path          Static       Path
Error Category      Analysis    Testing†      Analysis     Testing†

Computational         14          58                       4.0/12.7
Logic                 14          63                       3.5/11.7
Input/Output          17          17                       1.0/14.7
Data Handling         28          28                       2.5/7.0
Interface             25          25                       4.0/19.5
Data Definition       25          25                       1.0/5.0
Data Base              0          38                       2.0/13.9

Total                 16%         45%         2.0/24.0§    3.1/11.9‡

* E = engineering hours (average per error category).
  C = CDC 7600 computer seconds (average per error category).
  As a baseline, complete compilation and execution took 10 seconds.

† Path testing combined with inspection aided by path testing.

§ All 49 errors seeded simultaneously.

‡ Average of all errors detected by path testing.

The multiple-error experiment indicated that an automated means of
printing intermediate results and tracing program execution is more
effective for locating errors than the combination of path coverage
testing and static analysis. The data gathered in this experiment are
presented in Sec. 6. The conclusion drawn from the analysis of these data
is that redundant functional information embedded in programs is
necessary for automated tools to be more effective.

An important outcome of the error-seeding activity was that when
program verification is based on demonstrating complete path coverage,
one can still expect approximately 25 percent of the program errors to
remain. Path testing depends upon some manifestation of an error in the
program output. We found that, when known errors were inserted and the
program was executed with complete coverage data derived from path
testing, 25 percent of the errors did not cause any change in the output.
These errors were not used in the experiments.

It is possible that many of those errors are harmless in one
specific application of a general-purpose program (e.g., incorrect
computations are never used or are corrected before harm is done). It is
more likely, however, that the data generated to satisfy the path testing
requirement of a specified percentage of coverage cause control flow to
execute sequences of paths which do not exhibit the errors. This is one
reason why path testing should always be coupled with stress or boundary
condition testing. Overall path coverage may not be increased, but the
right sequence of paths may be executed to expose errors.
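
The effect can be reproduced on a toy scale. The Python sketch below is a
hypothetical example, not one of the errors seeded in the experiments: a
relational operator is changed, test data that exercise every segment
produce identical output from both versions, and only a value placed
exactly on the boundary exposes the difference.

```python
def count_within(values, limit):
    """Reference version: count the values that do not exceed the limit."""
    count = 0
    for v in values:
        if v <= limit:
            count += 1
    return count


def count_within_seeded(values, limit):
    """Same routine with one seeded error: '<=' replaced by '<'."""
    count = 0
    for v in values:
        if v < limit:
            count += 1
    return count


LIMIT = 5
coverage_data = [1, 9]   # takes both outcomes of the IF, so every segment runs
boundary_data = [5]      # sits exactly on the limit

masked = count_within(coverage_data, LIMIT) == count_within_seeded(coverage_data, LIMIT)
exposed = count_within(boundary_data, LIMIT) != count_within_seeded(boundary_data, LIMIT)
print("masked under full coverage:", masked)
print("exposed by boundary data:  ", exposed)
```

The boundary case adds no new segment to the coverage report, yet it is
the only input of the three that shows the error in the output.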
This set of experiments reinforced the intuitive feeling that
error detection is a difficult and highly individual process. Even
armed with test tools, complete software verification is still very
much a function of human intuition and resourcefulness. The software
testing process should not depend entirely on any single current
state-of-the-art technique but should encompass as many tools as are
practical. Attempting to detect seeded errors of specified type and
frequency during system-level or acceptance testing provides a valid
measure of test data thoroughness (e.g., did the execution output show
the presence of the seeded error?) and fault tolerance of the software
(e.g., did other parts of the software correct the error?).

It  appears  that,  until  software  specification  and  implementation 
through  a computer  language  are  more  integrated  and  standardized, 
software  testing  will  never  be  an  automated  process. 


PRELIMINARY  ANALYSIS


Eight small programs from The Elements of Programming Style^1 were
tested using the static analysis and path testing capabilities of SQLAB
and the static analysis capability of the DAVE system. These programs,
all under 30 source lines, performed such functions as table lookup,
binary search, and computing electrical current. Listings of these
programs are included in Appendix A.

There  were  two  motives  for  spending  any  time  at  all  on  these  small 
programs:  curiosity,  and  the  fact  that  Goodenough  and  Howden  both  have 
based  comments  regarding  the  validity  of  path  testing  on  these  programs. 
Neither  researcher,  however,  used  an  actual  path  testing  tool  in  making 
their  judgements.  Table  2.1  presents  our  results  from  analyzing  the 
eight  programs.  We  have  categorized  the  errors  found  into  three 
levels.  We  consider  the  first  level  errors  the  most  serious  in  terms 
of  their  impact  on  computed  results  and  possible  cost  (in  a non-test-tool 
environment)  to  detect.  The  third  level  errors  are  the  least  serious. 


For these same programs, Howden said that 40-65 percent of the
errors might (he did not actually use a tool) be found using path
testing. Our experience was that 70 percent of the errors were found by
path testing alone. When the programs were subjected to both static
analysis and path testing, 38 percent of the errors were detected by
static analysis and another 38 percent were found by path testing
(errors already detected by static analysis were not counted again for
the path tester). The errors are further described in Tables 2.2 - 2.4.
^1 Kernighan and Plauger,
TABLE 2.1

ERROR CLASSIFICATION AND DETECTION RESULTS FOR PROGRAMS
FROM THE ELEMENTS OF PROGRAMMING STYLE

                                        NUMBER OF ERRORS

                               Static and Path       Path Testing
                               Testing Combined      Alone
CATEGORIES                      S      P      X       P      X

Level 1.
  Uninitialized variables       7      0      0       7      0
  Computational logic           0      2      1       2      1
  Loop logic                    0      5      1       5      1

Level 2.
  Unchecked array boundary      0      0      1       0      1
  Equality comparison           0      1      1       1      1

Level 3.
  Improper termination          1      1      0       2      0
  Mixed mode                    1      1      2       1      3
  Unused variables              1      0      0       0      1

Totals                         10     10      6      18      8
Percent                        38%    38%    24%     70%    30%

Total Errors = 26

S = static analysis; P = path testing; X = undetected error.
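
The totals and percentages in Table 2.1 can be recomputed from the
per-category counts. The Python sketch below does so; the dictionary
layout and variable names are chosen here for illustration only. Straight
division of the totals by 26 gives 23, 69, and 31 percent where the table
prints 24, 70, and 30, so the printed figures appear to have been rounded
slightly differently.

```python
# Rows of Table 2.1: (S, P, X) for static analysis and path testing combined,
# followed by (P, X) for path testing alone.
table_2_1 = {
    "Uninitialized variables":  (7, 0, 0, 7, 0),
    "Computational logic":      (0, 2, 1, 2, 1),
    "Loop logic":               (0, 5, 1, 5, 1),
    "Unchecked array boundary": (0, 0, 1, 0, 1),
    "Equality comparison":      (0, 1, 1, 1, 1),
    "Improper termination":     (1, 1, 0, 2, 0),
    "Mixed mode":               (1, 1, 2, 1, 3),
    "Unused variables":         (1, 0, 0, 0, 1),
}

# Column sums, in the order S, P, X (combined), P, X (alone).
totals = [sum(column) for column in zip(*table_2_1.values())]

# Every error falls in exactly one of S, P, or X in the combined run.
total_errors = sum(totals[:3])

percents = [round(100 * t / total_errors) for t in totals]
print("totals:  ", totals)
print("errors:  ", total_errors)
print("percents:", percents)
```

The column sums match the table's totals row, and the two "alone" columns
also add to 26, which confirms that both experiments classified the same
set of errors.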

In this small exercise, path testing alone uncovered most of the
errors found by static analysis. However, errors detected by static
analysis used but a fraction of the resources path testing required.
In addition, the static analyzer points out errors explicitly. The
DAVE system detected all the errors SQLAB did, with the exception of
one mixed mode and one improper program termination error. In path
testing, the execution output must be studied for possible errors, and
the execution coverage reports must be reviewed to determine what paths
were taken when erroneous behavior was exhibited. If there is little
output produced in a program, then the tester may have to add printout
statements to display intermediate results as paths are executed.
TEST  OBJECT


The program selected for Phases 2 and 3 as the test object for
error seeding is an example program from the TRAID subroutine package.^1
TRAID, a GRC software product developed in 1968 to help solve missile
trajectory problems, contains 105 modules primarily for calculating
powered and guided-flight trajectories and Keplerian orbits. It also
includes support routines for vector and matrix operations, conversion
of units of measure, plotting, and report generation. TRAID has been
in continuous use at GRC since 1968 and has required very few changes
or modifications over this period.

The test program computes the closest approach between an ICBM and
an interceptor missile. Data for the program include descriptions of
the ICBM's trajectory and the interceptor's flight characteristics
(i.e., thrust, mass, burn time, drag, etc.) and a schedule of
interceptor maneuvers.

The test program employs 57 TRAID routines which expand to
approximately 5000 lines (over 3000 complete statements) of FORTRAN code.
This program was selected for error-seeding because it is stable,
believed to be bug-free, and large enough to constitute a realistic
debugging problem.

3.1  MODIFICATION  OF  THE  TEST  OBJECT 

A number of modifications were made to the test program: to replace
some of the non-ANS-standard FORTRAN code which the SQLAB test tools
would not accept, to correct errors found during static program
testing, and to enable the program to process multiple test cases in a
single run.


^1 T. Plambeck, The Compleat Traidsman, General Research Corporation,
   IM-711/2, September 1969.
3.1.1  ANS  Standard  Corrections


A lenient compiler and unenforced coding standards contributed to
approximately 167 lines of non-standard FORTRAN code which could not be
recognized by the SQLAB test tool. Three types of illegal code had to
be corrected: multiple assignment statements, multiple statements per
line, and an alien form of DATA statement. Functionally identical
ANS-standard FORTRAN code was substituted for the offending statements.

3.1.2  Static  Analysis 

Static analysis of the unseeded test program using SQLAB revealed
several potential sources of error. For example, in one case two
locally declared arrays were assumed to occupy contiguous storage
space. The second array was used as an overflow area when the first
array was filled. Data could be read into the second array but was
only referenced by over-subscripting the first array. This error was
indicated by SQLAB's set/use checking facility since the contents of
the second array were set but never used.
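The set/use check in this example can be illustrated with a minimal sketch. It is not SQLAB's implementation; the event list, the array names BUF1 and BUF2, and the function name are hypothetical, chosen only to show how a name that is set but never referenced would be flagged.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    sets: int = 0  # statements that assign or read data into the name
    uses: int = 0  # statements that reference the name's value


def set_but_never_used(events):
    """Return the names that are set at least once but never used.

    events is an iterable of (name, kind) pairs, kind being "set" or "use".
    """
    table = {}
    for name, kind in events:
        usage = table.setdefault(name, Usage())
        if kind == "set":
            usage.sets += 1
        elif kind == "use":
            usage.uses += 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return sorted(n for n, u in table.items() if u.sets > 0 and u.uses == 0)


# Hypothetical events: BUF2 is filled by a READ but is only ever reached
# by over-subscripting BUF1, so no statement names BUF2 as a use.
events = [("BUF1", "set"), ("BUF1", "use"), ("BUF2", "set")]
flagged = set_but_never_used(events)
```

With these events only BUF2 is flagged, which mirrors the overflow-array case: the check cannot see the over-subscripted reference, so the second array appears to be dead storage.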

Other errors included incorrect array dimensions and a number of
mode violations for data types involving character (Hollerith) data.

None of the errors found, however, appeared to have any consequences
either to the operation of the program or to the printed results for
the example test data set. "Error" as used here means a violation
of the language definition or a dependency on the non-standard
characteristics of a particular compiler, operating system, or machine.

3.1.3  Multiple  Test  Cases 

The test program was further modified to enable the processing
of multiple test cases in a single run. The main program and two of the
TRAID routines were adapted for this purpose. The multiple test case
capability was originally intended to simplify the testing process. An
added advantage is that a significant portion of TRAID's data
manipulation facilities is now exercised by the test program.
3.2 EXPANDED DATA SET

The original test data set taken from the TRAID user's manual
exercised 50 percent of the total paths in the test program. SQLAB's
instrumentation facilities and the trace file analysis program were used
to create additional test cases to increase the number of paths
traversed. Based upon module function, size, position in the module
hierarchy, and path coverage from initial data, six modules were
selected as retesting targets. The expanded test data set resulted from
using path testing techniques to modify the initial data set. Using
the expanded test data, path coverage for the six modules rose from 44
percent (with initial test data) to 75 percent.

3.2.1  Instrumentation  Techniques 

Instrumenting a test program using SQLAB causes software probes
to be inserted in the program to trace its execution. The program is
then  run  with  a test  data  set  and  a trace  file  is  produced.  The  trace 
file  is  automatically  analyzed  and  a path  coverage  report  is  printed 
for  each  module,  as  shown  in  Fig.  3.1.  Program  paths  which  have  not 
been  exercised  by  the  test  data  are  flagged  in  this  report.  It  is  then 
up  to  the  tester  to  determine  the  conditions  that  cause  these  paths  to 
be  traversed  and  to  devise  appropriate  test  data. 
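A minimal sketch of the probe-and-trace idea follows. It assumes a hypothetical three-path routine (SIGNOF) with hand-inserted probes; SQLAB inserts its probes automatically, and its trace file and report formats are not reproduced here.

```python
def make_probe(trace):
    """Return a probe that appends (module, path number) to the trace."""
    def probe(module, path_id):
        trace.append((module, path_id))
    return probe


def sign_of(x, probe):
    # Hypothetical instrumented routine with three numbered paths.
    probe("SIGNOF", 1)          # entry path
    if x < 0:
        probe("SIGNOF", 2)      # true branch
        return "negative"
    probe("SIGNOF", 3)          # false branch
    return "non-negative"


def coverage_report(trace, paths_per_module):
    """Summarize the trace as {module: (paths hit, total paths, paths not hit)}."""
    hit = {}
    for module, path_id in trace:
        hit.setdefault(module, set()).add(path_id)
    report = {}
    for module, total in paths_per_module.items():
        covered = hit.get(module, set())
        missed = sorted(set(range(1, total + 1)) - covered)
        report[module] = (len(covered), total, missed)
    return report


trace = []
sign_of(5, make_probe(trace))
report = coverage_report(trace, {"SIGNOF": 3})
```

Here the single test case leaves path 2 unexercised, so the report flags it; the tester would then have to devise an input (any negative value) that drives execution down that path.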

Executing the complete instrumented program resulted in the path
coverage information listed in Table 3.1. Path testing computer time
(on the CDC 7600) for the complete test object was as follows (in
seconds):

    Instrumentation                         30
    Compilation of instrumented source      11
    Loading object file                      1
    Execution using initial data            39
    Coverage analysis                       21

    Total                                  102
TABLE 3.1

PATH COVERAGE OF TEST OBJECT USING INITIAL DATA SET

Module        Paths Hit    Total Paths    % Coverage

KOrNYERG      20           46             43
LSKIP         7            7              100

Totals        645          1,283          50



3.2.2  Retesting  Strategy 

The task of increasing path coverage is easily subdivided on a
per-module basis. Several of SQLAB's documentation reports provide
additional information for managing the testing activity. For example,
the wrap-up report, shown in Fig. 3.2, lists the number of statements
and the number of paths in each module. The invocation bands reports,
which show module dependencies and the calling structure of the program,
are also helpful. These reports can be generated for each module in the
system. One is shown in Fig. 3.3.

Choosing test targets for expanding a data set should be based
on software function, location in the module hierarchy, path coverage
derived from existing data, and other factors particular to the test
object. A general path testing-based methodology is given in the JAVS
User's Guide.^

For this test object, eight of the largest (in terms of FORTRAN
statements) and highest-level (in terms of module control hierarchy)
modules were selected as targets. These modules are the starred and
checked modules in Table 3.1. Using the initial data set, most of the
modules had fairly low path coverage. It was found by inspection that,
due to the data passed to them by the main program, modules ORIO and
STOUT would never achieve much higher path coverage unless they were
removed from the test object environment and driven separately. Thus
they were omitted as test targets for the purpose of expanding the data
set.

Path coverage of the remaining six modules was used as a basis
for data set expansion. Several additional data sets were derived,
and the resulting path coverage is shown in Table 3.2.


^C. Gannon and N. B. Brooks, JAVS Technical Report, Vol. 1: User's
Guide, General Research Corporation CR-1-722/1, June 1978.
Figure 3.2. SQLAB Wrap-up Report

Figure 3.3. SQLAB Invocation Bands Report

TABLE 3.2

PATH COVERAGE OF SELECTED MODULES
USING EXPANDED DATA SET

              Expanded Data                   Initial Data    Expanded Data
Module        Paths Hit        Total Paths    % Coverage      % Coverage

FLIGHT        22               27             68              81
FLIN          50               89             49              56
HEAD          53               61             41              87
ORB2          38               62             16              61
PREDATA       94               99             26              95
STIN          31               47             49              66

Totals        288              385            44%             75%
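As a check on the figures in Table 3.2, the expanded-data coverage percentages can be recomputed from the paths-hit and total-paths columns. The short sketch below does only that arithmetic; the dictionary and function names are ours, and rounding to the nearest whole percent is an assumption about how the table was prepared.

```python
# (paths hit with expanded data, total paths), copied from Table 3.2
TABLE_3_2 = {
    "FLIGHT": (22, 27),
    "FLIN": (50, 89),
    "HEAD": (53, 61),
    "ORB2": (38, 62),
    "PREDATA": (94, 99),
    "STIN": (31, 47),
}


def percent_coverage(paths_hit, total_paths):
    """Path coverage as a whole-number percentage."""
    return round(100 * paths_hit / total_paths)


per_module = {m: percent_coverage(h, t) for m, (h, t) in TABLE_3_2.items()}
hit_sum = sum(h for h, _ in TABLE_3_2.values())
path_sum = sum(t for _, t in TABLE_3_2.values())
overall = percent_coverage(hit_sum, path_sum)
```

The column sums come to 288 paths hit out of 385, and the overall figure is the ratio of those sums rather than the mean of the six module percentages.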

The first module for which new data was created was the data
manipulation routine PREDATA. The coverage for this module was increased
from 26 percent to 95 percent by adding two additional test cases to
the original data set. This module is the largest TRAID routine (250
statements, 99 paths), and it was clear that it had not been very
thoroughly exercised by the original data set. The results were less
dramatic for other modules. The coverage for the routine that controls
the guided missile flight was increased only 3.2 percent, from 68.2 to
71.4 percent. Coverage of subordinate modules, however, was
significantly increased.

Finally, it should be noted that a number of program segments could
not be reached by changing the input data. Many of the TRAID routines
are general in purpose but are only used in a specific mode or for a
specific feature. For example, 10 of the 63 paths in the flight control
routine were found to be unreachable because of the main program
construction. Other paths which lead to abnormal program termination were
checked manually and are intentionally avoided during instrumented test
runs. Path coverage results must, therefore, be interpreted carefully.
4 ERROR SEEDING

In  generating  errors  in  the  test  software  several  considerations 
were  found  appropriate: 

1.  To  be  realistic,  the  errors  should  be  representative  of 
those  found  in  large  programs  in  both  type  and  frequency 
of  occurrence. 

2.  The  error  types  must  be  applicable  to  the  test  software 
and  the  test  environment. 

3.  To evaluate test tools which utilize program execution,
one or more errors should lead to abnormal program
behavior for at least some test data.

The following subsections describe how error types were selected and their
frequency determined, demonstrate how these criteria were applied to
the test software in generating errors, and present the results of
executing the software with seeded errors.

4.1  ERROR  TYPES  AND  FREQUENCY 

Several studies^1-3 have reported on the kinds and numbers of
errors found in real-time programs. Of these, the data in the TRW study
are directly applicable to the error-seeding experiment. We have used
the Project 5 data from that work as the basis for the error types and
their corresponding frequencies of occurrence.


^1 T. A. Thayer et al., Software Reliability Study, TRW Defense and Space
Systems Group, RADC-TR-76-238, Redondo Beach, California, August 1976.

^2 N. J. Fries, Software Error Data Acquisition, Boeing Aerospace
Company, RADC-TR-77-130, Seattle, Washington, April 1977.

^3 Verification and Validation for Terminal Defense Program Software:
The Development of a Software Error Theory to Classify and Detect
Software Errors, Logicon HR-74012, May 1974.
There are several factors which limited the types of errors
which were used for the experiment. (1) The experiment was conducted on
existing software whose system requirements are not documented.
(2) In that there is no time-critical or interactive requirement, the
test software itself lacks certain characteristics of real-time
programs. Rather, the test environment has the test software executing as
a normal batch job. (3) During path testing, certain test tool
software is executed in conjunction with the test object software, with
added overhead. (4) The purpose of the experiment is to evaluate the
use of test tools in locating errors in programs (not errors in
specifications or documentation). Therefore, error types related to
requirements, real-time performance, interactive usage, operating system
interface, and software developmental procedures were not considered
relevant to the experiment.

The Project 5 data is based on a list of 79 error types, shown in
Table 4.1, grouped into twelve categories. In the TRW study only
categories A through H and J resulted in code changes to the software. For
the experiment, category J and error types D500, D700, D800, F400, F500,
and F600 are not applicable to the test software and the test environment.

The first three columns of Table 4.2 contain error frequency data
from Project 5. Listed for each major category (categories C and E were
combined) are the number of errors resulting in code changes and the
percent of total errors. Since category J is not applicable to the
experiment, the percentages have been adjusted to those listed in column
5. In generating errors for the experiment, the applicable percentages
were used as a goal for each major category. Column 6 lists the number
of errors actually generated for the experiment and column 7 lists the
number of errors which exhibited abnormal program behavior in the
output from the test software when a single error was present.
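The adjustment of percentages after dropping category J amounts to renormalizing over the remaining categories. The sketch below shows that arithmetic under stated assumptions: the error counts are invented placeholders, not the Project 5 figures of Table 4.2, and the function name is ours.

```python
def adjusted_percentages(error_counts, excluded):
    """Recompute each category's share after removing excluded categories.

    error_counts maps a category letter to its number of errors.
    """
    kept = {cat: n for cat, n in error_counts.items() if cat not in excluded}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no errors remain after exclusion")
    return {cat: 100.0 * n / total for cat, n in kept.items()}


# Hypothetical counts, for illustration only (not the Table 4.2 data).
counts = {"A": 30, "B": 50, "J": 20}
adjusted = adjusted_percentages(counts, excluded={"J"})
```

With these placeholder counts, category A moves from 30 percent of all errors to 37.5 percent of the applicable errors, and the adjusted shares again sum to 100.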
Table 4.1. Error Types Used in Experiment

        Applicable to
Code    Experiment      PROJECT 5 ERROR CATEGORIES*

A000    Yes             COMPUTATIONAL ERRORS
A100    Yes             Incorrect operand in equation
A200    Yes             Incorrect use of parenthesis
A300    Yes             Sign convention error
A400    Yes             Units or data conversion error
A500    Yes             Computation produces over/under flow
A600    Yes             Incorrect/inaccurate equation used/wrong sequence
A700    Yes             Precision loss due to mixed mode
A800    Yes             Missing computation
A900    Yes             Rounding or truncation error

B000    Yes             LOGIC ERRORS
B100    Yes             Incorrect operand in logical expression
B200    Yes             Logic activities out of sequence
B300    Yes             Wrong variable being checked
B400    Yes             Missing logic or condition tests
B500    Yes             Too many/few statements in loop
B600    Yes             Loop iterated incorrect number of times (including endless loop)
B700    Yes             Duplicate logic

C000    Yes             DATA INPUT ERRORS
C100    Yes             Invalid input read from correct data file
C200    Yes             Input read from incorrect data file
C300    Yes             Incorrect input format
C400    Yes             Incorrect format statement referenced
C500    Yes             End of file encountered prematurely
C600    Yes             End of file missing

D000    Yes             DATA HANDLING ERRORS
D050    Yes             Data file not rewound before reading
D100    Yes             Data initialization not done
D200    Yes             Data initialization done improperly
D300    Yes             Variable used as a flag or index not set properly
D400    Yes             Variable referred to by the wrong name
D500                    Bit manipulation done incorrectly
D600    Yes             Incorrect variable type
D700                    Data packing/unpacking error
D800                    Sort error
D900    Yes             Subscripting error

E000    Yes             DATA OUTPUT ERRORS
E100    Yes             Data written on wrong file
E200    Yes             Data written according to the wrong format statement
E300    Yes             Data written in wrong format
E400    Yes             Data written with wrong carriage control
E500    Yes             Incomplete or missing output
E600    Yes             Output field size too small
E700    Yes             Line count or page eject problem
E800    Yes             Output garbled or misleading

F000    Yes             INTERFACE ERRORS
F100    Yes             Wrong subroutine called
F200    Yes             Call to subroutine not made or made in wrong place
F300    Yes             Subroutine arguments not consistent in type, units, order, etc.
F400                    Subroutine called is nonexistent
F500                    Software/data base interface error
F600                    Software user interface error
F700    Yes             Software/software interface error

G000    Yes             DATA DEFINITION ERRORS
G100    Yes             Data not properly defined/dimensioned
G200    Yes             Data referenced out of bounds
G300    Yes             Data being referenced at incorrect location
G400    Yes             Data pointers not incremented properly

H000    Yes             DATA BASE ERRORS
H100    Yes             Data not initialized in data base
H200    Yes             Data initialized to incorrect value
H300    Yes             Data units are incorrect

I000                    OPERATION ERRORS
I100                    Operating system error (vendor supplied)
I200                    Hardware error
I300                    Operator error
I400                    Test execution error
I500                    User misunderstanding/error
I600                    Configuration control error

J000                    OTHER
J100                    Time limit exceeded
J200                    Core storage limit exceeded
J300                    Output line limit exceeded
J400                    Compilation error
J500                    Code or design inefficient/not necessary
J600                    User/programmer requested enhancement
J700                    Design nonresponsive to requirements
J800                    Code delivery or redelivery
J900                    Software not compatible with project standards

K000                    DOCUMENTATION ERRORS
K100                    User manual
K200                    Interface specification
K300                    Design specification
K400                    Requirements specification
K500                    Test documentation

X0000                   PROBLEM REPORT REJECTION
X0001                   No problem
X0002                   Void/withdrawn
X0003                   Out of scope - not part of approved design
X0004                   Duplicates another problem report
X0005                   Deferred

*From Table 3-2 of TRW Study
4.2 ERROR GENERATION


In  addition  to  generating  errors  whose  type  and  frequency  have 
their  bases  in  a published  study,  the  location  of  each  error  and  the 
program's  resulting  behavior  were  also  prime  concerns  in  maintaining  an 
objective  experiment.  In  the  TRW  study,  no  data  linking  the  error  type 
to  software  property  (e.g.,  statement  type)  is  presented.  Using  the 
error  type  made  it  necessary  to  establish  correlations  between  each 
error  type  and  quantifiable  test  software  properties.  Furthermore, 
since  the  test  object  consists  primarily  of  general  utility  subroutines, 
many  having  alternative  segments  of  code  whose  execution  depends  upon 
their  input  parameter  data,  we  felt  that  the  errors  should  reside  on 
segments  of  code  that  are  executed  by  a thorough  (in  terms  of  program 
function  and  structure)  set  of  test  data,  and  that  the  errors  should 
manifest  themselves  by  some  deviation  in  the  program's  normal  output. 

To generate errors with these properties, the following steps were
performed.

1.  The test software was analyzed by the test tool^ to classify
source statements, to obtain software documentation
reference material (e.g., symbol set/usage, module interaction
hierarchy, location of all invocations), to guide insertion
of errors, and to generate an expanded set of test data that
provided thorough path coverage. The percentage of path
coverage varied from module to module depending upon the
main program's application of the utility subroutines.

2.  A matrix  showing  error  types  versus  statement  classification 
was  manually  derived. 

3.  The  information  from  steps  1 and  2 was  combined  into  a 
matrix  showing  potential  sites  in  the  software  for  each 
error  type. 


^D.  M.  Andrews  and  J.  P.  Benson,  Software  Quality  Laboratory  User's 
Manual,  General  Research  Corporation  CR-4-770,  May  1978. 
4.  From the potential site matrix, a list of candidate error
sites was randomly generated.

5.  At  each  site  in  the  list  either  an  error  of  the  designated 
type  was  manually  inserted  or  the  site  was  rejected  as  being 
unsuitable  for  the  error  type. 

6.  Errors  were  eliminated  from  the  error  set  which  caused  a 
compiler  or  loader  diagnostic. 

7.  The 86 errors shown in column 6 of Table 4.2 were selected
from the remaining errors using Project 5 error frequency
data. Errors from this set were eliminated if they caused
no change in the output. Fifteen errors were rejected due
to lack of coverage with the test data, and 22 were
eliminated for which coverage was achieved without affecting the
output. The surviving 49 errors, shown in column 7, were
used.

Error site execution or reference was verified by an output
message placed, for the case of executable statements, at the error site
or, for the case of non-executable statements, at the site of reference
by some executable statement on a covered path. The impact of this
evidence is that path testing with the sole goal of execution coverage
is not an adequate verification measure. Most software tool developers
whose verification tools include a path testing capability advocate
their usage with data that demonstrate all specific functions of the
software. Even then, stress and other performance testing should enter
into the total test plan.

4.2.1  Error  Seeding  Preliminary  Analysis 

Using  the  SQLAB  tools,  the  original  test  software  was  processed 
to  generate  standard  documentation  and  static  analysis  reports.  The 
reports  include  the  following: 


1.  A list  of  the  software  properties  of  each  module  with  a 
count  of  each  statement  type  and  the  characteristics  of 
the  interface 

2.  A listing  of  the  source  for  each  module  in  the  test  software 

3.  Source  for  all  invocations  to  and  from  each  module 

4.  Local  and  global  cross  reference  lists  indicating  usage  for 
all  names 

5.  Path  identification  for  each  DD-path  in  each  module 

6.  Hierarchy  relationships  between  modules 

7.  Static  checks  on  variable  usage. 

A master  list  of  test  software  properties  was  constructed  from  item  1 
and  the  linkage  established  between  each  software  property  and  the 
error  types.  The  linkages  together  with  the  data  for  each  module  were 
used  to  select  candidate  error  sites.  The  other  reports  were  used  to 
generate  actual  errors.  The  following  subsections  explain  how  this  was 
accomplished. 

4.2.2  Software  Property  and  Error  Type  Linkage 

The  master  list  of  software  properties  constructed  from  item  1 
(see  Table  4.3)  reflects  the  dialect  of  FORTRAN  used  (e.g.,  DECODE  and 
ENCODE)  and  the  statement  types  used  in  the  test  software  (e.g.,  no 
DOUBLE  PRECISION  or  PUNCH  statements).  Additionally,  the  list  includes 
only  those  statement  types  relevant  to  the  experiment  (e.g.,  comment 
statements and END statements are omitted). Two interface properties
are included, Parameter and Invocation, although there is some overlap
with other constructs. The linkage between software properties and
error  types  was  established  by  listing,  for  each  error  type,  all  software 
properties  that  could  be  the  site  of  an  error  of  that  type.  These 
linkages  are  shown  in  Table  4.3. 
TABLE 4.3

RELATIONSHIPS BETWEEN SOFTWARE PROPERTIES AND ERROR TYPES


The linkages were manually generated. In some instances, syntactic
and semantic rules for FORTRAN were used to determine entries. For
example, any statement type in which an arithmetic expression is permitted
(e.g., Assignment, CALL, IF) is a possible site for an error in the
computational category (error types A100 through A900). Similarly, FORMAT,
READ, PRINT, WRITE, DECODE, and ENCODE statements are possible sites
for input and output error types in categories C and E involving data
conversion.

Other entries in the table indicate statement types which are
directly associated with an error type, although an error may involve a
combination or sequence of statements including other types not marked.
For example, error type B500 (too many/few statements in loop) is
directly associated with a DO statement (marked in the table) combined with
at least one of any other executable statement (not marked).

In some instances, entries reflect how the test software
processing is accomplished, although it may not be a common usage of the
language. An example of this is the usage of assignment statements to
construct variable formats, thereby linking the Assignment Statement
type to error types C300 (incorrect input format) and E300 (data written
in wrong format).

4.2.3  Candidate  Error  Site  Selection 

The method used for error selection attempts to be realistic by
utilizing published error types and frequencies (Table 4.2) while
remaining objective by selecting placement at random. The test software
contains over 50 modules. For each module, data showing the count of each
software property was collected from SQLAB reports (see Sec. 4.2.1) into
a matrix of the form shown in Fig. 4.1. This matrix, when combined
with the matrix linking software property to error type (Table 4.3),
yields a matrix of candidate error sites for each error type in each
module. The form of the candidate error site matrix, as shown in
Fig. 4.2, is subdivided according to major error category.
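One way to picture the combination of the two matrices is sketched below. This is our reading of the procedure, not the program used in the experiment: the statement counts are invented, the linkage entries are a small illustrative subset, and we assume that the number of candidate sites for an error type in a module is the sum of the counts of all linked software properties in that module.

```python
def candidate_site_matrix(property_counts, linkage):
    """Combine per-module property counts with property/error-type linkages.

    property_counts: {module: {software property: count}}
    linkage:         {error type: set of software properties}
    Returns          {module: {error type: number of candidate sites}}
    """
    matrix = {}
    for module, counts in property_counts.items():
        matrix[module] = {
            error_type: sum(counts.get(prop, 0) for prop in props)
            for error_type, props in linkage.items()
        }
    return matrix


# Hypothetical counts for two modules (illustration only).
property_counts = {
    "STREP": {"ASSIGNMENT": 12, "IF": 5, "DO": 3},
    "PREDATA": {"ASSIGNMENT": 40, "READ": 6},
}
# Illustrative linkages: arithmetic expressions may appear in Assignment,
# CALL, and IF statements; B500 is tied to the DO statement.
linkage = {
    "A100": {"ASSIGNMENT", "CALL", "IF"},
    "B500": {"DO"},
}
sites = candidate_site_matrix(property_counts, linkage)
```

Under these assumptions STREP offers 17 candidate sites for A100 and 3 for B500, while PREDATA offers none for B500 because it contains no DO statements in the invented data.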
For each major error category a randomly selected list of
candidate error sites was generated using a simple computer program to
perform the necessary computations for site selection. Input to the program
consisted of the following data:

1.  Error  Category  and  Error  Type  List  (Table  4.1). 

2.  A list  of  Software  Property  Names  (See  Table  4.3). 

3.  A list  of  Module  Names  (from  SQLAB  reports). 

4.  Error  Category  Frequency  (Col.  5 of  Table  4.2). 

5.  Software  Property  and  Error  Type  Linkages  (Table  4.3). 

6.  Software  Property  and  Module  Matrix  (from  SQLAB  reports). 

7.  The  number  of  error  sites  to  generate. 

8.  Possible  causes  for  each  error  type  (statement  sequence 
omitted  or  extra  statement,  input  data) 

The site selection program contains no algorithms to reject a selected
site which is unsuitable for a particular error type (e.g., an
assignment statement without any parentheses for error type A200, Incorrect
use of parenthesis). To provide for manual rejection of unsuitable
sites, the number of sites was chosen to be twenty times the targeted
number (50) for the experiment, or 1000 sites.

Output from the program consists of a list of the randomly
selected candidate error sites for each major category. The number of sites
generated for each category is proportional to the error frequency for
the category, with the total number of sites equal to the desired number.
The output for each candidate site identifies the site by module name,
software property, the property's sequence number within the module, and
the error type with its description. In addition, the possible causes
for the error type are listed. Fig. 4.3 contains an excerpt reproduced
from the output for major error category A, Computational. How this
list was used to generate errors is explained in the following
subsection.
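The proportional allocation and random draw described here can be sketched as follows. The original site selection program is not reproduced in this report, so everything below is an assumption: the largest-remainder rounding, the sampling without replacement, the fixed seed, and the tuple layout of a candidate site are ours.

```python
import random


def sites_per_category(frequencies, n_sites):
    """Split n_sites across categories in proportion to their frequencies.

    Uses largest-remainder rounding so the quotas sum exactly to n_sites.
    """
    total = sum(frequencies.values())
    exact = {cat: n_sites * f / total for cat, f in frequencies.items()}
    quota = {cat: int(x) for cat, x in exact.items()}
    leftover = n_sites - sum(quota.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - quota[c], reverse=True)
    for cat in by_remainder[:leftover]:
        quota[cat] += 1
    return quota


def select_sites(candidates, frequencies, n_sites, seed=0):
    """Draw a random list of candidate sites for each major category.

    candidates: {category: [(module, property, sequence number, error type)]}
    """
    rng = random.Random(seed)
    quota = sites_per_category(frequencies, n_sites)
    selected = {}
    for cat, k in quota.items():
        pool = candidates.get(cat, [])
        selected[cat] = rng.sample(pool, min(k, len(pool)))
    return selected


# Hypothetical frequencies and candidate pool (illustration only).
quota = sites_per_category({"A": 25.0, "B": 75.0}, 1000)
pool = {"A": [("STREP", "ASSIGNMENT", i, "A100") for i in range(1, 301)]}
picked = select_sites(pool, {"A": 25.0, "B": 75.0}, 1000)
```

Sampling without replacement avoids duplicate sites in the list; if the original program could repeat a site, the manual step that rejects a previously accepted site would have handled it.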
Figure 4.3. Candidate Sites for A000 Computational

4.2.4 Error Set Generation

The  task  of  generating  a representative  set  of  errors  for  the 
experiment  consisted  of  three  major  steps: 

Step 1. Using the candidate error site list as a guideline, a
set of error packets was created which contained a
selection of errors in the desired frequency for each of the
major error categories.

Step 2. Error packets resulting in compiler or loader error
messages or warnings were eliminated from the set.

Step 3. The acceptable error packets were applied, one at a time,
to the source program and the results of executing the
erroneous program analyzed and classified for later use
in the experiment.

These  three  steps  were  repeated  one  time  to  obtain  the  final  set 
of  error  packets  used  in  the  experiment. 

Error  Packets 

Step  1 in  this  process  was  performed  by  repeating  for  each  major 
error  category  the  following  sequence  until  the  desired  number  of  errors 
were  generated: 

1.  Choose  the  next  (initially,  the  first)  site  in  the 
candidate  site  list  (See  Fig.  4.3). 

2.  Locate  the  site  in  the  source  program  listing  (e.g.,  the 
seventh  assignment  statement  in  STREP).  Reject  site  if 
previously  accepted;  otherwise,  continue. 

3.  Determine if error type is applicable to site (e.g., would
a change in operand be a likely error in the statement?).
If not, reject site; otherwise, continue.

4.  If site is an executable statement, determine whether
statement was executed with test data using coverage
reports from test software coverage analysis activity
(Sec. 4.3.2.2). If site is a declaration statement,
determine, if possible, whether information in declaration was
referenced by using coverage reports. Accept site and
continue if criteria met; otherwise, reject site.

5.  Generate error packet for acceptable site and mark site
to avoid duplication.

Each  error  packet  includes  the  following  items: 

1.  A unique, randomly selected packet identification name.

2.  A code change constituting the error.

3.  A print message identifying on the output the error by
packet identification name (added as the first executable
statement of the main program).

4.  A comment statement identifying the error site and type
(added at the error site).

5.  A print message to record when the module containing the
error is entered (added as the first executable statement
of the module entry).

6.  A print  message  to  record  when  the  error  site  is  executed 
(added  at  the  error  site  or  at  the  location  where  the 
error  is  effective). 
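The six items of an error packet can be pictured as a simple record. The sketch below is only a bookkeeping illustration: the field names, the six-letter name format, and the helper functions are hypothetical, and no attempt is made to reproduce the directive syntax of the utility that actually managed the packets.

```python
import random
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorPacket:
    name: str                  # item 1: unique, randomly selected name
    code_change: str           # item 2: the seeded error itself
    banner_message: str        # item 3: print in main program naming the packet
    site_comment: str          # item 4: comment marking error site and type
    module_entry_message: str  # item 5: print when the module is entered
    site_message: str          # item 6: print when the error site is executed

    def items(self):
        """All six items, in the order listed in the text."""
        return (self.name, self.code_change, self.banner_message,
                self.site_comment, self.module_entry_message, self.site_message)


def random_packet_names(count, length=6, seed=0):
    """Generate `count` distinct random names, returned in ascending order."""
    rng = random.Random(seed)
    names = set()
    while len(names) < count:
        names.add("".join(rng.choice(string.ascii_uppercase) for _ in range(length)))
    return sorted(names)


names = random_packet_names(5)
packet = ErrorPacket(names[0], "<changed statement>", "<banner print>",
                     "<site comment>", "<entry print>", "<site print>")
```

Because the names are random, sorting the packets by name gives an ordering unrelated to module, error type, or the order in which the errors were created.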

An example error packet is shown in Fig. 4.4. The system utility
UPDATE was used to manage the error packets. Each item consists of one
or more UPDATE directives (first character is *) and FORTRAN source text.
The UPDATE directive serves to identify the packet (*ID) or to insert
text (*I) or delete and insert text (*D) at a designated place in the
test software. The set of error packets was placed in ascending order
by the (randomly selected) packet names before input to the UPDATE
utility.

The complete error packet was used to analyze the effect of the
presence of each error prior to use in the experiment. For the
experiment only the first three items in each packet were used to modify the
software. One or more error packets were selected, then the source of
the complete program including the errors was made available to the
tester in a form which concealed the site and type of error (see
Secs. 5 and 6).

Compiler  and  Loader  Qualification 

Step 2 in the error set generation process served to eliminate
from the error set those errors which were revealed by the compiler or
loader. The complete set of error packets was applied to the source
program and the erroneous source compiled and executed. Error packets
which resulted in compiler or loader error messages or warnings were
eliminated from the set. A warning of an unset variable is an example
of a compiler message which caused rejection of an error packet;
similarly, an unsatisfied external warning by the loader caused
rejection.

Error  Analysis 

Step  3 was  to  analyze  the  effect  of  the  presence  of  each  error 
during execution. The test software, with one error packet applied,
was  compiled  and  executed  with  the  sample  test  data  obtained  from  prelim- 
inary coverage  analysis.  The  output  was  then  examined  for  print  messages 
from  items  5 and  6 of  the  error  packet.  In  addition,  comparisons  were 
made  to  normal  program  output  obtained  by  executing  the  error-free  test 
software  with  the  same  data.  The  results  of  each  error  run  were 
classified  in  one  of  the  following  categories: 

Figure 4.5.  Selected Entries from Results List


1.  No  observed  effect  on  normal  output. 

2.  Module  containing  error  executed  with  no  observed  effect  on 
normal  output. 

3.  Module  containing  error  executed  and  error  site  executed 
with  no  observed  effect  on  normal  output. 

4.  Module  containing  error  executed  and  error  site  executed 
with  error  manifested  by  differences  in  error  run  output 
from  normal  output. 

Errors in category 4 were used in the path-testing portion of the
experiment. Errors in all categories were used in the other portions of
the experiment.
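
The four result categories can be read as a decision on three
observations per error run. The sketch below is only an illustration of
that reading, not the procedure used in the study: the function and
argument names are hypothetical, category 1 is assumed to mean that the
module containing the error was never executed, and an output
difference is assumed to imply that the error site was reached.

```python
def classify_error_run(module_executed: bool,
                       site_executed: bool,
                       output_differs: bool) -> int:
    """Return the result category (1-4) for one single-error run.

    Assumption: a difference from normal output is attributed to the
    seeded error, so it implies that the error site was executed.
    """
    if output_differs:
        return 4          # error manifested in the output
    if site_executed:
        return 3          # site executed, output still normal
    if module_executed:
        return 2          # module executed, site not reached
    return 1              # assumed: module never executed


def usable_for_path_testing(category: int) -> bool:
    """Only errors manifested in the output (category 4) qualify."""
    return category == 4


if __name__ == "__main__":
    print(classify_error_run(True, True, False))   # a category 3 run
```

Ordering the checks from the strongest observation to the weakest
keeps the categories mutually exclusive.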


For errors used in path testing, a "user complaint" about the
erroneous output was prepared. The output problems included not only
premature termination of program execution, but also discrepancies in
user-expected program behavior, output format, and numeric results.
Selected entries from a list of error packet names and results,
prepared for use in the experiments, are shown in Fig. 4.5.

A total  of  86  errors  were  generated;  of  these,  49  errors  were  mani- 
fested by  erroneous  output.  A breakdown  by  error  type  is  shown  in  Table 
4.4.  These  are  also  summarized  by  major  error  category  in  Table  4.2 
together  with  the  error  frequency  data. 

Table  4.5  shows  the  distribution  of  errors  by  software  property 
and  major  error  category;  the  total  occurrences  of  the  software  property 
in  the  test  software  is  also  shown.  Each  non-blank  entry  represents  a 
statement  property  linked  to  a major  error  category.  Each  non-zero 
entry  is  the  count  of  error  packets  generated  or  manifested  in  the  output. 

Table 4.6 shows the distribution of errors by count of error packets
in single module and cumulative error run results. The 86 error packets
were generated in 33 of the 57 modules in the test software. Ten
modules had only one packet and no modules had more than eight. During
single-error runs, modules containing 84 of the 86 errors were executed
in 31 of the 33 error-seeded modules (two were contained in
error-recovery routines not executed for the sample test data). The
error site was executed for 71 of the 86 errors in 29 modules; but the
error was manifested by the output in only 49 of the 86 error runs in
25 modules.


TABLE 4.4

ERROR RUN RESULTS BY ERROR TYPE

                          Error Packets   Errors Manifested
Category                    Generated         in Output

A.  Computational
    A100                        2                 2
    A200                        3                 2
    A300                        1                 0
    A400                        1                 1
    A500                        2                 0
    A600                        2                 0
    A700
    A800                        3                 2
    A900
    Total                      14                 7

B.  Logic
    B100                        2                 1
    B200                        7                 1
    B300                        3                 3
    B400                        4                 3
    B500                        5                 4
    B600                        2                 1
    B700                        2                 0
    Total                      25                13

C, E.  Input/Output
    C100
    C200                        2                 2
    C300
    C400
    C500
    C600
    E100
    E200                        1                 1
    E300                        1                 1
    E400
    E500                        1                 1
    E600                        4                 1
    E700
    Total                       9                 6

D.  Data Handling
    D050                        1                 1
    D100                        1                 1
    D200                        4                 2
    D300
    D400                        3                 2
    D600
    D900                        1                 1
    Total                      10                 7

F.  Interface
    F100
    F200                        4                 2
    F300
    F700                        3                 2
    Total                       7                 4

G.  Data Definition
    G100                        2                 2
    G200                        2                 2
    G300                        2                 0
    G400                        2                 0
    Total                       8                 4

H.  Data Base
    H100                        3                 2
    H200                        5                 2
    H300                        5                 4
    Total                      13                 8

Grand total                    86                49


TABLE 4.6

Note the large drop (22) in the number of errors manifested in
output from the number whose site was executed (49 from 71). It is
not uncommon for software containing errors to produce the "right"
output even if the site of the error is executed. Upon analysis, these
errors, although potentially dangerous, proved to be harmless in the
test environment. For example, one caused calculations to be needlessly
repeated, another preset data which was later reset before being
used, and a third performed calculations whose results were never used.
All three of these errors were time-consuming errors which could affect
real-time responses. Table 4.7 lists the reasons these 22 errors
resulted in acceptable output.
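
The attrition from seeded errors to errors visible in the output can be
tabulated directly from the counts quoted in the text. The short sketch
below is only a bookkeeping aid; the stage labels and the `funnel`
helper are illustrative names, not terms from the study.

```python
# Counts reported in the text for the 86 single-error runs.
STAGES = [
    ("error packets generated", 86),
    ("containing module executed", 84),
    ("error site executed", 71),
    ("error manifested in output", 49),
]


def funnel(stages):
    """Return (label, count, lost_since_previous, fraction_of_total)."""
    total = stages[0][1]
    previous = total
    rows = []
    for label, count in stages:
        rows.append((label, count, previous - count, count / total))
        previous = count
    return rows


rows = funnel(STAGES)

if __name__ == "__main__":
    for label, count, lost, fraction in rows:
        print(f"{label:<30} {count:3d}  lost {lost:2d}  {fraction:6.1%}")
```

The last row reproduces the drop of 22 errors between "site executed"
and "manifested in output" that Table 4.7 breaks down by reason.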


TABLE 4.7

CATEGORY 3 ERRORS (SITE EXECUTED)

Reason Error Not Manifested                      Number of Errors

Variable value(s) acceptable                             8
Variable reset before use on path taken                  5
Loop executed only once                                  3
Statement sequence has no effect for path taken          2
Timing not critical                                      2
Variable not used after set                              1
Input data complete                                      1

Total                                                   22
5.  SINGLE-ERROR EXPERIMENT


5.1  DESCRIPTION 

Errors from the seven major categories were seeded, one at a
time, into the FORTRAN program according to the frequencies shown in
Table 5.1 and Fig. 5.1. For each error the analyst was given a
compilation and execution listing which gave no clues to the error's
location. He was told what was wrong with the output and had, as a
specification of the proper program performance, a listing of the
correct output. The task was to find the error using execution coverage
analysis (path testing) or inspection, whichever seemed more
appropriate, correct the source, and execute the corrected program to
verify the output. Human and computer times were accounted for from the
time the tester received the erroneous listing to the time he delivered
the corrected listing.

To  evaluate  the  types  of  errors  detected  by  static  analysis,  all 
49  errors  were  simultaneously  seeded  into  the  program  after  determining 
that  they  did  not  interfere  with  each  other  in  the  static  sense.  Only 
one  computer  run  was  required  for  this  evaluation. 

Unlike static analysis, which explicitly detects inconsistencies
and locates the offending statement(s), path testing is a technique
that demands skill to interpret the execution coverage data as well as
to recognize improper program performance from the program's output.

5.2  PATH-TESTING  PHASE 

For the path-testing evaluation phase, we found that the errors
were located using three detection methods: path testing alone,
inspection aided by path testing, and inspection alone. Some errors
were easily detected without the necessity of instrumenting the code
to get path coverage. Some errors were found when the path coverage
reports narrowed the search to a set of suspicious paths, but then
inspection was used to actually determine the error. Other errors were
found directly by observing the control path behavior from the coverage
reports and the path statement definition listing. In a few cases the
wrong "error" was found and only some of the incorrect symptoms
disappeared (these are noted in Table 5.2).

Figure 5.1.  Error Frequency in Major Categories

ERROR FREQUENCY IN MAJOR CATEGORIES

Figure  5.2  shows  the  frequencies  of  error  categories  detected  by 
the  methods  described  above.  The  dashed  lines  show  the  effect  of  some 
degree  of  path  testing  coverage  by  reporting  the  sum  of  path  testing 
alone  and  inspection  aided  by  path  testing.  As  one  might  expect,  logic 
errors  and  computation  errors  (since  they  often  cause  a change  in  con- 
trol flow)  are  the  best  candidates  for  path  testing.  Errors  in  these 
two  categories  are  often  the  most  difficult  to  locate,  unless  a de- 
tailed design  and  specification  are  also  available.  Input/output  and 
data  definition  errors  are  usually  easily  detected  by  inspection  alone. 

More  comprehensive  results  are  shown  in  Table  5.2.  Note  that  not 
all  minor  error  types  were  seeded  into  the  program,  owing  to  project 
limitations. For each error seeded, Table 5.2 shows the technique used
to  detect  it.  An  asterisk  next  to  the  technique's  indicator  signifies 
that  the  erroneous  statement  was  located  but  the  "correction"  was  not 
the  proper  one,  or  that  more  information  (such  as  a specification)  was 
needed  to  make  the  proper  changes. 

To assess the value of path testing, an account was kept of the
resources expended. The average engineering time in hours for finding
each error is shown in Fig. 5.3. Most of the errors detected by
inspection required only about 1 1/2 hours to find and correct. On the
other hand, the more difficult errors requiring path testing took about
4 hours.
Figure 5.2.  Path Testing: Frequency of Detected Errors by Category

Figure 5.3.  Path Testing: Average Time Expended per Error


5.3  STATIC ANALYSIS PHASE


Static analysis has capabilities for detecting infinite loops,
unreachable code, uninitialized variables, and inconsistencies in
variable and parameter mode. Some sophisticated compilers have a few
of these capabilities. In our experiment, static analysis detected 16
percent (8 errors) of the total 49 seeded errors. Figure 5.4 shows the
frequency of detected errors by major category, and Table 5.2 lists
each error type found by static analysis. One error detected by the
graph checking capability of the static analyzer was unreachable code
due to a missing IF statement. This error (B400) was not detected by
either path testing or inspection. Unreachable code can be very
difficult to locate in code filled with statement labels and three-way
IF statements, as was the test object for the experiment. Unreachable
code may be harmless or it may not, but it is always a warning of
possible dangers or inefficient use of computer resources.
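
A graph check for unreachable code amounts to a reachability search
over the program's control-flow graph. The sketch below illustrates the
general idea only; it is not SQLAB's algorithm, and the statement
labels and graphs are hypothetical stand-ins for a fragment in which a
deleted IF leaves one labeled statement without any incoming branch.

```python
from collections import deque


def unreachable_blocks(cfg, entry):
    """Return the nodes of `cfg` that cannot be reached from `entry`.

    `cfg` maps each statement label to the labels that control may
    reach next.
    """
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for successor in cfg.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return sorted(set(cfg) - seen)


# Hypothetical fragment: with the IF present, label 10 branches to 20 or 30.
with_if = {10: [20, 30], 20: [40], 30: [40], 40: []}
# With the IF missing, control falls from 10 straight to 30; 20 is orphaned.
missing_if = {10: [30], 20: [40], 30: [40], 40: []}

if __name__ == "__main__":
    print(unreachable_blocks(with_if, 10))
    print(unreachable_blocks(missing_if, 10))
```

Because the search never executes the program, it flags the orphaned
statement regardless of which test data are available, which is why
this kind of error can escape path testing and still be caught
statically.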

While static analysis did not detect a high percentage of errors,
and while most of the errors it did find were also detected by path
testing, it has the distinct advantage of being a very economical tool.
Only two engineering hours and 24 seconds of CDC 7600 time were
required to review the static analysis output and locate the errors. A
disadvantage is that if programming practice allows frequent
intentional mixed-mode constructs or mismatched numbers of actual and
formal parameters, the static analyzer will issue frequent warnings and
errors (133 in our experiment) that are harmless to the proper
execution of the program.

Both error-seeding and error detection activities of the experiment
provided concrete data for several conclusions about the two
testing techniques. While the experiment was designed and implemented
in an objective manner and can be repeated by other interested
researchers, it is not our intention to apply a metric or statistical
significance to the error detection capabilities of the testing methods.
It is our purpose, however, to report the types of errors that can be
detected by these techniques. The results of the experiment can also
be used as a reference for tool developers seeking to sharpen their
tools for more rigorous error detection.

6.  MULTI-ERROR EXPERIMENT

A multi-error  experiment  was  conducted  to  evaluate  the  utility 
of  static  analysis  and  path  testing  under  more  realistic  conditions 
where  several  errors  exist  in  a program.  The  experimental  conditions 
were  designed  to  simulate  a typical  software  testing  environment  in 
which  the  program  can  be  compiled  and  run  but  the  performance  or  out- 
put does  not  meet  specifications. 

There are two aspects of the multiple-error situation which make
it quite different from the single-error conditions. First, the actual
number of errors in a program is not known. The tester might try to
estimate the number of expected errors, but will not be sure of the
extent of the testing task. Testing strategies may be adjusted based on
this subjective assessment. Also, estimates of the testing time
required and the degree of testing completeness will be based on this
imperfect information.

The second aspect of multiple errors not found in single-error
conditions is the problem of one error masking the effects of another.
The syndrome of "just one more error" is due at least in part to error
symptoms which suddenly appear when an error is corrected. Many times it
is difficult to determine whether latent errors are exposed or new errors
are introduced when "correcting" a suspected error. There is also a
fatigue factor or saturation limit on the number of errors one tester
can find, and this limit is almost always less than the actual number of
errors in a program.

6.1  DESCRIPTION  OF  THE  MULTI-ERROR  EXPERIMENT 

The multi-error testing environment was established by seeding
the 5000-line FORTRAN test object (program) with 22 of the errors
used in the single-error experiment. The error categories and
frequency of seeded errors are shown in Fig. 6.1. This collection of
errors was the largest set which could be introduced at one time
and still have the program run to "normal completion." Figure 6.2
shows a comparison of the number of errors seeded with the number of
errors found in other delivered software. This graph, taken from
Gannon,¹ indicates that 22 errors could be easily expected in a
program of 5000 lines which has been acceptance-tested.

Figure 6.1.  Error Frequency in Major Categories

Two testers analyzed the error-seeded program: one using SQLAB
for static analysis and path testing and the other using the debugging
trace facility provided by the compiler. The two testers worked
independently and neither was involved with the single-error experiment.
The number of seeded errors was not disclosed to the testers.

Both  testers  were  allowed  the  same  amount  of  time  (120  hours) 
to  conduct  their  tests.  Both  worked  from  the  same  test  object  and 
test  dataset,  and  both  used  the  same  computer  facility.  Both  testers 
were  free  to  use  extended  compiler  reports,  insert  debugging  print 
statements,  and  modify  the  supplied  dataset. 

Activity  reports  were  prepared  as  in  the  single-error  experiment 
to  document  the  error  analysis  and  correction  process.  A log  was  also 
kept to help document the sequence of actions taken in detecting errors.

6.2  RESULTS OF THE MULTI-ERROR EXPERIMENT

The results of the multi-error experiment are difficult to
interpret for a number of reasons. Individual performance in programming
and debugging is highly variable, and since only two people
participated in this phase of the project, statistical measures cannot be
derived with confidence. There are, however, some interesting
comparisons to be drawn from the data collected and some ideas for
improving testing tools and techniques.


¹C. Gannon, "A Verification Case Study," Proceedings of AIAA Computers
in Aerospace Conference, Los Angeles, November 1977.

Figure 6.2.  Errors in Delivered Software


The variation in individual performance in programming and
debugging was found to range over more than an order of magnitude by
Sackman¹ in the early 1960s. More recent experiments by Myers²
confirm this variability and indicate that modern computer science has
not improved this aspect of human fallibility. In light of these
independent results, it is surprising how closely the results of the
multi-error experiment compare.


6.2.1  Error  Detection  Results 

The results of the multi-error experiment are presented in Table
6.1, which is organized by error category. Of the 22 seeded errors, 11
were found by the tester using the SQLAB test tools and 15 were found
by the tester using the debugging trace facilities provided by the
compiler. Nine of the errors were found by both testers. The bar graph
in Fig. 6.3 provides an overview of the categories of errors detected
by each tester.

The  information  in  Table  6.1  is  presented  in  another  form  in  Fig. 
6.4,  organized  by  the  order  in  which  the  errors  were  discovered  by 
the  two  testers.  The  horizontal  axis  represents  the  sequence  in  which 
errors  were  found  by  the  tester  using  the  SQLAB  test  tools.  The  vertical 
axis  represents  the  sequence  in  which  errors  were  found  by  the  tester 
using  the  compiler's  debug-trace  facility.  The  error  numbers  and  their 
categories  appear  at  the  coordinate  positions  corresponding  to  when  they 
were  discovered.  For  example,  error  E047  was  the  sixth  error  found  by 
the  SQLAB  tester  and  the  eleventh  error  found  by  the  other  tester.  Hence, 
error  E047  appears  at  coordinate  position  (6,11)  in  the  figure. 


¹H. Sackman, Man-Computer Problem Solving: Experimental Evaluation
of Time-Sharing and Batch Processing, Petrocelli Books, 1970.

²G. J. Myers, "A Controlled Experiment in Program Testing and Code
Walkthroughs/Inspections," CACM, Vol. 21, No. 9, Sept. 1978.

Figure 6.3.  Categories of Errors and Method of Detection
in the Multi-Error Experiment

Four distinct classes of errors can be derived from the data in
Fig. 6.4. The four errors clustered in the lower left-hand corner
represent easy errors which are quickly and easily diagnosed and hence
are perhaps not serious problems. The other five errors found by both
testers are somewhat more difficult to find and hence might be
classified as moderately difficult. The collection of eight errors found
by one tester and not the other forms a class of errors which are more
difficult to diagnose than the errors found by both testers. The last
five errors, which were not discovered by either tester, represent a
class of subtle errors which are likely to escape detection during
formal testing.
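
The class sizes quoted above follow from the four totals reported for
the experiment by simple inclusion-exclusion. The sketch below makes
that arithmetic explicit; the function and key names are illustrative
only.

```python
def partition_counts(total, found_a, found_b, found_both):
    """Split `total` seeded errors by which of two testers found them."""
    only_a = found_a - found_both
    only_b = found_b - found_both
    found_by_either = found_a + found_b - found_both
    return {
        "both": found_both,
        "only_a": only_a,
        "only_b": only_b,
        "neither": total - found_by_either,
    }


# Totals from the text: tester A used SQLAB, tester B the compiler trace.
counts = partition_counts(total=22, found_a=11, found_b=15, found_both=9)

if __name__ == "__main__":
    print(counts)
    print("found by exactly one tester:",
          counts["only_a"] + counts["only_b"])
```

The nine errors found by both testers are the four "easy" plus the five
"moderately difficult" errors of Fig. 6.4.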

6.2.2  Resources  Expended 

The  resources  in  terms  of  engineering  and  computer  time  used  by 
the  two  multi-error  testers  are  presented  in  Table  6.1.  Only  the  times 
which  could  be  directly  attributed  to  individual  errors  are  recorded  in 
this  table.  Hence,  some  of  the  entries  have  been  left  blank.  Also, 
the  total  times  reported  are  larger  than  the  sums  of  the  individual 
times  recorded  in  each  column. 

The first item to note is the total engineering time spent by the
two testers. Each tester was allotted 120 hours for the task. The
SQLAB-based tester spent 72 hours; the compiler-based tester spent
only 50. Both testers expressed a feeling of having reached the limit
of their effectiveness in finding more errors. The SQLAB-based
tester seemed overwhelmed by the complexity of the mathematics and the
inscrutability of the program. The lack of specifications for the
program and documentation from earlier testing efforts also contributed.
The compiler-based tester thought there was probably only one error
left in the program (when in fact there were seven more errors) but
felt it would take an inordinate amount of time to diagnose.

Perhaps too much emphasis has been placed on the testing tools
and not enough on human factors. The psychological stress of testing
and debugging a program can be severe. Both testers found the task
quite difficult and frustrating. The satisfaction of finding an error
did not seem sufficiently rewarding to stimulate renewed efforts. The
reward was often the exposure of symptoms of more errors.

Comparison of the resources (engineering time, computer time,
etc.) used by each of the testers shows no statistically significant
differences when judged against Sackman's and Myers' evidence of
individual variability. The time spent per error, which can be derived
from the measured data, showed the largest difference between the two
testers. The tester using the SQLAB test tools spent 72 hours and found
11 errors, or about 6.5 hours per error. The tester using the compiler's
trace facility spent 50 hours and found 15 errors, or about 3.3 hours
per error. The ratio of 6.5:3.3 (1.97), however, is still not
statistically significant. The compiler-based tester felt that the
debugging-trace facility reduced the time he spent to about one-half of
the time he would have spent inserting debugging print statements
manually.

Another parameter derived from the measured data was the amount
of time spent per computer run. The tester using the SQLAB test tools
spent 72 hours and ran 78 jobs, or about 56 minutes per run. The tester
using the compiler's trace facility spent 50 hours and ran 90 jobs, or
about 33 minutes per run. The ratio of 56:33 (1.70) compares closely
with the ratio of 1.97 found for the time spent per error. The tester
using the SQLAB test tools observed that many of SQLAB's diagnostics
and warnings indicated violations of programming standards which did
not affect the computation and hence were not counted as errors. Each
warning had to be checked out, however, which may account for some of
the differences in performance.
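
The derived figures can be recomputed from the totals quoted in the
text. The sketch below does only that arithmetic; the names are
illustrative. Note that, starting from the rounded totals of 72 hours
and 78 jobs, the SQLAB time per run comes out slightly under the
reported 56 minutes, presumably because the reported value was derived
from unrounded hours.

```python
def summarize(hours, errors_found, runs):
    """Derive per-error and per-run effort from one tester's totals."""
    return {
        "hours_per_error": hours / errors_found,
        "minutes_per_run": 60.0 * hours / runs,
    }


# Totals reported in the text for the two multi-error testers.
sqlab = summarize(hours=72, errors_found=11, runs=78)
compiler = summarize(hours=50, errors_found=15, runs=90)

if __name__ == "__main__":
    for name, s in (("SQLAB", sqlab), ("compiler trace", compiler)):
        print(f"{name:<15} {s['hours_per_error']:.1f} h/error  "
              f"{s['minutes_per_run']:.1f} min/run")
```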

The activity reports which were prepared by both testers
indicated that the tester using the compiler's debugging facilities was
better able to discern the effects due to different errors in the
program. He was, therefore, able to isolate problems, focus his
attention, and find errors more quickly. His approach was to work on
finding the cause of the first discrepancy which appeared in the output.
The rest of the output was disregarded because it contained symptoms of
other errors which would not help locate the first error.

The other tester, using the SQLAB test tools, spent a considerable
amount of time studying the reports generated by the tools and checking
out the reported errors and warnings. The test program contained many
violations of modern programming standards and practices which SQLAB
faithfully reported. Only four of the 22 seeded errors were found by
SQLAB's static analysis, yet the static analysis reports contained 51
unrelated error and warning messages. Most of the warnings were
mixed-mode Hollerith expressions, and the error messages flagged
"uninitialized" variables that had been set via their equivalenced
names. This aspect of SQLAB's reports indicates the importance of using
them during program development to enforce good programming practices.
The tester was also misled by a modification which cleared several error
symptoms but did not correct the error. The modification created a more
subtle "double error" in a section of the program which was thought to
be working correctly.

The engineering time spent by the two multi-error testers is
presented in another form in Fig. 6.5. In this figure the horizontal
axis represents the time spent by the SQLAB-based tester and the
vertical axis represents the time spent by the compiler-based tester.
The scale represents the rank order of the engineering times recorded in
Table 6.1. Errors which required more time have higher rank.

The first observation which one can make is that the errors
detected by the SQLAB-based tester using static analysis required
relatively little time to identify and correct. This confirms the
expected utility of static testing for a subset of the errors
encountered and is perhaps not too surprising. Error E085 was
well-camouflaged by other warnings as described earlier and the
diagnostics for error E028 were somewhat misleading. The effects are
clearly displayed in Fig. 6.5.

The  data  from  the  nine  errors  that  were  found  by  both  testers  can 
be  further  analyzed  as  samples  of  the  error  detection  and  correction 
process.  Two  simple  non-parametric  statistical  tests  were  applied  to 
this  data.  The  Wilcoxon  two-sample  test  for  unpaired  samples  showed 
no  significant  difference  in  the  engineering  time  expended  by  the  two 
testers  on  these  errors.  The  Spearman  rank  correlation  was  also  com- 
puted from  this  data  and  was  found  to  be  quite  small  (r=-.208).  The 
significance  of  this  result  is  unknown  and  no  explanation  has  been 
found. 
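
For readers who wish to repeat this kind of comparison, the Spearman
rank correlation can be computed as the ordinary correlation
coefficient of the two rank vectors. The sketch below is a generic
implementation under that definition; the sample times are hypothetical
placeholders, not the Table 6.1 values, so it does not reproduce the
reported r = -.208.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical engineering hours per error for two testers.
hours_a = [1.0, 2.5, 0.5, 4.0, 3.0]
hours_b = [2.0, 1.0, 3.5, 0.5, 4.0]

if __name__ == "__main__":
    print(round(spearman(hours_a, hours_b), 3))
```

A value near zero, as reported in the text, means that the errors one
tester found time-consuming were not systematically the ones the other
tester found time-consuming.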

6.2.3  Examples  of  Errors  Not  Found 

At  the  conclusion  of  the  multiple  error  experiment,  five  errors 
had  not  been  found  by  either  of  the  two  testers.  The  SQLAB-based 
tester  estimated  there  were  considerably  more  errors  left  than  the  11 
she  had  found.  The  compiler-based  tester  knew  there  was  at  least  one 
more  error  but  thought  it  was  probably  the  last. 

The three errors in the "computational errors" category proved to
be the most difficult to find. This might have been expected since
neither tester was very familiar with formulas for missile flight,
elliptic orbits, or coordinate transformations in three dimensions.
Error number E007 was the only error in this category found by both
testers. It was one of the last errors found and required more than
average time to discover.




Error number E008 was very similar in form to error E007 but was
not found by either tester. For error E008 an intermediate result in
the calculation of the Euler angles for an orbit was calculated using
the wrong operand in the equation [cos(θ) instead of sin(θ)]. A
major contributing factor to the difficulty with this error was that
the correct values for the computed Euler angles were not available to
the testers. Only after many steps of intervening computations were
the effects of this error finally exposed.


Unit  testing  of  the  module  containing  error  E008  would  have 
readily  shown  the  existence  of  an  error.  It  is  believed,  however,  that 
this  error  would  have  been  difficult  to  isolate  and  correct  even  if  the 
search  was  restricted  to  the  program  module  containing  the  error. 


Error number E018 was the third error in the "computational"
category and represented the sub-category "Incorrect use of parenthesis"
(A200). The calculation of the length of the major axis of an elliptic
orbit was changed from

     A=GCON*ADIV(R,(2.*GCON-R*(VR**2+VQ**2)))

to

     A=GCON*ADIV(R,(2.*GCON-R*(VR+VQ)**2))

Several additional orbital parameters were then computed using the value
of A. As with error E008, the correct values of the orbital parameters
were not available to the testers and the effects were not evident until
some time later. It should be noted that the tester in the
single-error experiment also failed to find this error.
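
The two forms of the statement differ only by the cross term 2*VR*VQ
that appears when the sum is squared instead of the individual
components. The sketch below compares them numerically. It is an
illustration under stated assumptions: `adiv` is a stand-in for the
program's ADIV routine, assumed here to be plain division, and the
input values are hypothetical.

```python
def adiv(numerator, denominator):
    """Stand-in for the program's ADIV routine (assumed: plain division)."""
    return numerator / denominator


def major_axis_correct(gcon, r, vr, vq):
    """Original statement: squares each velocity component, then adds."""
    return gcon * adiv(r, 2.0 * gcon - r * (vr ** 2 + vq ** 2))


def major_axis_seeded(gcon, r, vr, vq):
    """Seeded error E018: squares the sum of the velocity components."""
    return gcon * adiv(r, 2.0 * gcon - r * (vr + vq) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs in arbitrary consistent units.
    for vr, vq in ((0.3, 0.4), (0.0, 0.5)):
        good = major_axis_correct(1.0, 1.0, vr, vq)
        bad = major_axis_seeded(1.0, 1.0, vr, vq)
        print(f"VR={vr} VQ={vq}  correct={good:.4f}  seeded={bad:.4f}")
```

Whenever either velocity component is zero the cross term vanishes and
both statements give the same value, so such a fault produces no
symptom at all on those inputs.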


Only one error in the "logic error" category was missed by both
testers. This was error number E013, in which the wrong statement label
was assigned to a program variable, thus causing a control flow error.
The "assigned GOTO" is one of FORTRAN's more baroque features and it
was used extensively in the module containing this error. In fact,
the control flow in this module was so complex that SQLAB's
restructuring capability failed to sort it out. SQLAB's restructuring
capability is used to convert unstructured code into structured code
automatically, but in this case it failed to complete the analysis of
all the possible paths within this module.

Error  number  E078  was  intended  to  simulate  a database  error, 

"data  units  incorrect"  (H300) . The  seeded  error  also  looks  like  an 
"Incorrect  operand  in  equation"  (AlOO) , a computational  error,  although 
it  does  exhibit  units  problems.  In  the  calculation  of  the  position, 
velocity,  and  acceleration  of  an  object  in  orbit  the  intermediate 
result 

Q=2.*ATAN2(X1,X2) 
was  changed  to 

Q=TWOP I *ATAN2 (XI , X2 ) 

where TWOPI was a variable initialized to 6.2832 (radians). The function
ATAN2 (arctangent) returns an angle also with units of radians.
Hence the value computed for Q, which is an angular displacement, would
have incorrect units of radians-squared. Neither of the multiple-error
testers discovered this error. The single-error tester found the
offending statement but was unable to synthesize the correction.
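
The numerical effect of the E078 substitution can be illustrated with a minimal sketch. It is written in Python rather than the FORTRAN of the test object, and the input values X1 and X2 are hypothetical, chosen only for illustration; only the constant 6.2832 and the two forms of the statement come from the text above.

```python
import math

TWOPI = 6.2832  # value the report gives for the TWOPI variable


def q_original(x1, x2):
    """Intermediate result as originally coded: Q = 2.*ATAN2(X1,X2)."""
    return 2.0 * math.atan2(x1, x2)


def q_seeded(x1, x2):
    """Seeded error E078: the constant 2. is replaced by TWOPI."""
    return TWOPI * math.atan2(x1, x2)


# Hypothetical inputs, not taken from the report.
x1, x2 = 1.0, 1.0
ratio = q_seeded(x1, x2) / q_original(x1, x2)
print(q_original(x1, x2), q_seeded(x1, x2), ratio)
```

For any nonzero angle the seeded value differs from the original by the constant factor TWOPI/2, so the result is still a plausible-looking number, which may be one reason the error was hard to recognize from output alone.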



7 CONCLUSIONS 

This  project  provided  the  opportunity  for  a critical  and  objective 
assessment  of  the  only  two  automated  testing  techniques  that  are  mature 
enough  to  be  useful,  path  testing  and  static  analysis.  There  are  two 
unique  aspects  of  this  project  that  distinguish  the  results  from 
similar  software  testing  evaluation  experiments. 

1.  The  test  engineers  did  not  know  the  type 
or  location  of  the  program  errors. 

2.  An  automated  test  tool  was  used  for  error 
detection. 

Experience  has  shown  us  that  a simulated  tool  evaluation  of  a particular 
testing  technique  based  on  knowing  the  type  and  location  of  an  error 
does  not  address  many  of  the  difficulties  faced  by  using  a real  tool 
and not knowing anything about the error(s). Because software normally
contains numerous peculiarities of design or implementation, what
constitutes an error may not be obvious. Furthermore, automated test
tools (like compilers) are unforgiving in their consistency checking.
Static  analysis  is  particularly  affected  by  this  characteristic.  For 
a single,  erroneous  mixed  mode  expression,  there  may  be  hundreds  of 
correct,  intentional  ones,  yet  the  static  analyzer  will  faithfully 
report  all  inconsistencies. 
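
The mixed-mode point can be made concrete with a toy checker. This is a sketch only, in Python, and it does not reproduce the behavior of SQLAB or DAVE; the statement table and its line labels are hypothetical. It flags every statement whose operand or result types differ, with no way of knowing which one is the real error.

```python
# Each entry is (line label, type of left-hand side, types of operands).
# The statements are hypothetical; only the one at label 30 is "wrong".
STATEMENTS = [
    (10, "REAL", ["REAL", "INTEGER"]),     # intentional: scale by a loop index
    (20, "REAL", ["REAL", "REAL"]),        # consistent, never reported
    (30, "INTEGER", ["REAL", "INTEGER"]),  # erroneous truncation on assignment
    (40, "REAL", ["INTEGER", "REAL"]),     # intentional: count times a rate
]


def mixed_mode_report(statements):
    """Return the labels of all statements that mix more than one type."""
    flagged = []
    for label, lhs, operands in statements:
        if len(set(operands) | {lhs}) > 1:
            flagged.append(label)
    return flagged


print(mixed_mode_report(STATEMENTS))
```

The report lists labels 10, 30, and 40 alike. The analyst still has to decide by inspection which of the reported inconsistencies is an error.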

Similarly,  while  executing  a particular  path  might  cause  an  error 
to  manifest  itself  in  the  output,  doing  so  may  cause  many  other  paths  to 
be  executed,  perhaps  completely  masking  the  error.  This  problem  becomes 
acute when the output-producing code is distant from the source of error.
If  the  error  location  is  known  from  the  start,  it  may  be  a simple  matter 
to  determine  the  effectiveness  of  a particular  testing  technique. 

While  the  individual  characteristics  of  the  test  tool  used  in  the 
experiments  undoubtedly  played  a part  in  the  results,  the  primary 
testing  effectiveness,  we  feel,  is  due  to  the  two  techniques  used. 
For example, the DAVE system, a static analyzer, was found to detect
one class of errors (too many/too few statements in a loop: B500)
but was unable to detect others (such as missing logic or condition tests:
B400). Similarly, the path testing tool used in the experiments, SQLAB,
did  not  provide  the  valuable  dynamic  tracing  information  provided  by 
other  path-testing  tools  such  as  the  JOVIAL  Automated  Verification 
System  (JAVS).^  However,  we  believe  that  the  data  generated  in  the 
experiments  provide  a good  foundation  for  some  conclusions  about  the 
testing  methods. 

As described in earlier sections, for these experiments an error
is incorrect implementation of a specification or reliance on a compiler's,
operating system's, or machine's nonstandard capability. Examples
of "nonstandard" capabilities are assuming storage is preset to zero or
assuming arrays adjacently declared necessarily share contiguous storage
space. The "nonstandard" type of errors were removed from the test
object before starting the experiment, so as not to make the test
analyst's task of finding seeded errors even harder. This removal did
not, however, eliminate the error and warning messages described in
Sec. 6.2.2.

In addition, errors derived during the error-seeding process that,
though the site was executed, did not manifest themselves in the output
were not sown in the program for subsequent detection. This was
done because, owing to the lack of program specification, a listing of
the correct program's output was used as the only specification. Although
22 errors (25% of the total errors generated by the error-seeding
process) whose sites were executed were not used during the

^C.  Gannon  and  N.  B.  Brooks,  JAVS  Technical  Report.  Vol.  1:  User's 
Guide,  General  Research  Corporation  CR-1-722/1,  June  1978. 

experiments, their existence is the basis for one major conclusion
of this evaluation: Errors may reside on paths and statements that,
although executed, may not show up in the output for the test data
used. Thus testing must face the issue that more information must be
supplied in a program during development (at, undoubtedly, greater
programmer effort) to (1) direct testing of legal sequences of paths,
and (2) specify functional correctness of statements and paths.
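
A small illustration of an executed but invisible error follows. It is a hypothetical Python sketch, not one of the seeded errors from the experiments: the erroneous statement is executed by the test datum, yet the output matches the correct program's output.

```python
def correct(x):
    """Reference behavior (stands in for the correct program's output)."""
    return x + 2


def seeded(x):
    """Same routine with a hypothetical seeded error: '+' replaced by '*'."""
    return x * 2


# With this test datum the erroneous statement is executed but masked.
test_data = [2]
masked = all(correct(x) == seeded(x) for x in test_data)

# Inputs in a small range that would reveal the error.
revealing = [x for x in range(-3, 4) if correct(x) != seeded(x)]
print(masked, revealing)
```

Coverage of the statement is complete for this test datum, but the output listing gives no sign of the error; only different test data, or a statement-level specification, would expose it.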

7.1  EFFECTIVENESS  FOR  ERROR  DETECTION 

When an error is known to exist, as in the error-type detection
experiment (Phase 2, the single-error experiment), it was found that 40 percent
of the errors were readily found by inspection, 45 percent more
were found using path-testing assistance, and the remaining 15 percent
were not found or were improperly corrected. The errors found using
path testing were significantly more difficult than those found by
inspection, although no quantitative measure can be given for "difficulty."
The average time spent on errors found by inspection was one
hour, whereas for the more difficult errors found by path testing the
average time was three hours.

Path-testing  tools  do  not  generate  error  messages  indicating  the 
source  of  an  error  in  a program.  They  do,  however,  provide  a great 
deal  of  assistance  by  narrowing  the  scope  of  the  search  for  errors  and 
reducing  the  number  of  possible  error  sites  which  must  be  investigated. 
Hence, path testing is really enhanced inspection. The enhancement
increases  the  probability  of  finding  an  error  by  inspection  from  40  to 
80  percent. 

Path sequence information was found (by the tracing capability of
the compiler used in Phase 3, the multi-error experiment) to be more
valuable for finding errors than path coverage information. The major
drawback of typical path tracing techniques is the volume of rather
useless output surrounding usually one or two lines indicating
incorrect behavior. An improvement which could be supported by a test
tool would be a condensed report which would retain the valuable sequence
information. We feel that research should be directed toward
determining what a "valuable" sequence is. Of special consideration
are sequences which are functionally important and those which lead up
to or include threshold or boundary conditions.
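
One possible form of such a condensed report is sketched below in Python. It is an assumption about what a condensed report might look like, not a feature of SQLAB, JAVS, or the compiler used in Phase 3, and the segment numbers are hypothetical. The raw trace is reduced to the distinct segment-to-segment transitions, in order of first occurrence, each with a count.

```python
def condense(trace):
    """Summarize a segment trace as ordered (from, to) transitions with counts."""
    counts = {}
    for a, b in zip(trace, trace[1:]):
        counts[(a, b)] = counts.get((a, b), 0) + 1
    # Dictionaries preserve insertion order, so first occurrence order is kept.
    return list(counts.items())


# Hypothetical trace: segment 1, a loop over segments 2 and 3, then segment 4.
trace = [1, 2, 3, 2, 3, 2, 3, 4]
for (src, dst), n in condense(trace):
    print(f"{src} -> {dst}  executed {n} time(s)")
```

A loop that runs thousands of times contributes only a few lines to this report, while a transition that occurs once, which is often the interesting one, remains visible.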

Path testing was found to be helpful in all error categories.
There are examples of errors in each category which required the use of
path coverage information to discover the source of the trouble. However,
seven of the nine errors not found in the error-type detection
experiment were from the "computational," "logic," and "database"
categories, indicating some weakness of path testing in these areas.

Static  analysis  is  credited  with  finding  nine  of  the  49  errors 
used  in  the  error-type  detection  experiment — one  of  which  was  not  found 
by  the  path  testing  analyst.  The  economy  of  static  analysis  is  shown  by 
the  cost  of  its  use  (two  engineering  hours  and  24  computer  seconds) 
compared with the path testing cost for the same errors of 13.5 engineering
hours and 110 computer seconds. Even though only one of the
errors  generally  more  difficult  to  diagnose  was  found  using  static 
analysis,  it  is  an  effective  tool  for  screening  some  errors.  It  has 
the  advantage  of  generating  diagnostic  messages  about  errors  at  their 
statement  location,  and  it  does  not  depend  on  any  knowledge  of  error 
manifestation. 
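
The cost comparison above can be restated as ratios. The short Python calculation below uses only the four figures quoted in the preceding paragraph; the variable names are ours.

```python
# Figures quoted in the text for the same set of errors.
static_hours, static_cpu_seconds = 2.0, 24.0
path_hours, path_cpu_seconds = 13.5, 110.0

hours_ratio = path_hours / static_hours            # engineering effort
cpu_ratio = path_cpu_seconds / static_cpu_seconds  # computer time

print(f"engineering hours: {hours_ratio:.2f}x, computer seconds: {cpu_ratio:.2f}x")
```

On these figures path testing took roughly 6.75 times the engineering hours and about 4.6 times the computer time of static analysis for the same errors.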

7.2  EFFECTIVENESS  FOR  VERIFICATION 

Path testing provides little support for determining the correctness
of programs, even through exhaustive path coverage. The correct
functioning of a program has to be checked by other means. The primary
function provided by path coverage is an indication of parts of a program
which have not been exercised. Full path coverage does not ensure
complete or sufficient testing, since errors may occur on sequences of
paths which have not been tested. Furthermore, path testing and static
analysis are not capable of evaluating functional correctness unless
test data are derived from the software specification.
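
The gap between coverage and sequence testing can be shown with a hypothetical routine, sketched here in Python. Two test cases exercise every branch outcome, yet one combination of branches, never run together, fails.

```python
def scale(x, reset, bump):
    """Hypothetical routine with two independent decisions."""
    divisor = 4
    if reset:            # decision A
        divisor = 0
    if bump:             # decision B
        divisor += 2
    return x / divisor


# These two cases take both outcomes of decision A and of decision B.
covering_tests = [(8, True, True), (8, False, False)]
results = [scale(*case) for case in covering_tests]

# The sequence A-true followed by B-false was never exercised.
try:
    scale(8, True, False)
    untested_sequence_fails = False
except ZeroDivisionError:
    untested_sequence_fails = True

print(results, untested_sequence_fails)
```

Every branch is covered and both tests pass, but the untested sequence divides by zero. Coverage reports alone would give no hint that this sequence matters.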

Even with these limitations in mind, there appears to be considerable
room for improvement in path-oriented verification tools. The
missing ingredient seems to be a specification of the legal path sequences
which a program should be allowed to traverse. The combinatorial
nature of this problem makes it intractable for even small programs.
Approximations or heuristic algorithms, however, may yield
acceptable solutions for many real programs.
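
As one example of such an approximation, a specification might list only the legal pairs of consecutive segments rather than whole path sequences. The Python sketch below assumes that simplification; the segment names and the table of legal transitions are hypothetical and do not describe any tool used in the experiments.

```python
# Hypothetical specification: which segment may directly follow which.
LEGAL = {
    ("init", "read"),
    ("read", "compute"),
    ("compute", "write"),
    ("write", "read"),
    ("read", "stop"),
}


def first_illegal(trace, legal):
    """Return the first transition in the trace not allowed by the spec, or None."""
    for a, b in zip(trace, trace[1:]):
        if (a, b) not in legal:
            return (a, b)
    return None


good = ["init", "read", "compute", "write", "read", "stop"]
bad = ["init", "compute", "write"]
print(first_illegal(good, LEGAL), first_illegal(bad, LEGAL))
```

Checking pairs keeps the specification small, at the cost of missing errors that depend on longer sequences. Whether that trade is acceptable for real programs is the open question raised above.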

Hamlet^ describes a promising approach of using "computational
specifications" to complement the standard use of "functional
specifications" for programs. Computational specifications impose additional
constraints  on  how  results  are  to  be  obtained.  Functional  testing  can 
be  performed  only  on  a small  subset  of  the  input  domain.  However,  if 
correct  results  are  obtained  using  the  prescribed  computation,  then  the 
small  sample  tests  can  be  shown  to  be  reliable.  We  expect  that  path 
sequence  information  will  be  a major  component  in  such  computational 
specifications. 

7.3  VALUE  OF  ERROR  SEEDING 

The primary advantage of seeding errors for experiments is the
control it provides over the types and distribution of errors in a program.
Programs with authentic errors which satisfy requirements for
testing experimental hypotheses are simply not available on demand.
This control, we feel, is more important than the true authenticity of
the errors.


R.  G.  Hamlet,  "Critique  of  Reliability  Theory,"  Workshop  Digest, 
Workshop  on  Software  Testing  and  Test  Documentation,  Ft.  Lauderdale, 
Florida,  December  1978. 

The three testers involved in the error-type detection and testing
technique evaluation experiments in this study agreed that the seeded
errors were very realistic. They did not feel that the environment was
at all artificial or contrived. This was probably due to the care
taken in error selection and seeding. It also indicates that the results
of the experiments apply directly to real programs with authentic
errors.

One of the factors that was not controlled in our experiments was
the subtlety of the seeded errors and, hence, the difficulty of the
discovery. Defining subtlety may not be easy. In general, the most
difficult errors to discover were those which propagated incorrect
results through long sequences of computations with no outward sign of
trouble. When the symptom finally surfaced, the link back to the
originating error was completely obscured. Using degree of obscurity
as a measure of subtlety, one could construct a test program seeded with
easy errors, difficult errors, or some combination to test a hypothesis
about the effectiveness of a particular test tool or method.

An  analogy  can  be  drawn  between  testing  software  and  other 
scientific  investigations.  Error-seeding  experiments  correspond  to 
laboratory experiments where conditions can be controlled and many parameters
can be measured. Production programs in actual use correspond
to field studies where the conditions cannot be controlled and some
measurements cannot be made. The analogy extends to the need for relevancy
between error-seeding experiments and delivered software just as
the  need  exists  for  relevance  between  laboratory  experiments  and  field 
studies.  We  highly  recommend  the  practice  of  error-seeding  to  software 
testing  and  verification  tool  developers  as  a measure  of  effectiveness. 
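
A tool developer might tabulate such a measure of effectiveness as the fraction of seeded errors found, overall and by error category. The Python sketch below shows one way to do so. The counts are hypothetical and are not results from these experiments; only the category names are borrowed from the text.

```python
def detection_rates(seeded, found):
    """Fraction of seeded errors found in each category that has seeded errors."""
    return {cat: found.get(cat, 0) / n for cat, n in seeded.items() if n > 0}


# Hypothetical counts, for illustration only.
seeded = {"computational": 4, "logic": 5, "database": 3}
found = {"computational": 3, "logic": 4, "database": 1}

rates = detection_rates(seeded, found)
overall = sum(found.values()) / sum(seeded.values())
print(rates, round(overall, 3))
```

Reporting the rate by category, not only the overall figure, shows where a tool or method is weak, which a single overall figure would hide.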

APPENDIX A

Small Programs for Preliminary Analysis


      PROGRAM SINEFCN (INPUT,OUTPUT,TAPE5=INPUT,TAPE6=OUTPUT,TAPE12)
C
C DRIVER PROGRAM TO TEST THE DOUBLE PRECISION SINE FUNCTION
C REG MEESON 7/11/76
C
      DOUBLE PRECISION SIN, DSIN, DBLE, REF, VAL, E
      REAL X
C
      WRITE(6,100)
   10 READ (5,110) X, E
      WRITE(6,120) X, E
      IF( E .EQ. 0. ) STOP
      REF = DSIN( DBLE(X) )
      VAL = SIN(X,E)
      WRITE(6,130) REF, VAL
      GOTO 10
C
  100 FORMAT( 26H SINE FUNCTION TEST DRIVER // )
  110 FORMAT( F10.4, D10.2 )
  120 FORMAT( 3H X=, F10.4, 7H     E=, D20.12 )
  130 FORMAT( 1H+, 45X, 4HREF=, D20.12, 9H   VAL = , D20.12 )
C
      END
      DOUBLE PRECISION FUNCTION SIN(X,E)
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 77.
C
C THIS DECLARATION COMPUTES SIN(X) TO ACCURACY E
      DOUBLE PRECISION E,TERM,SUM
      REAL X
      TERM=X
      DO 20 I=3,100,2
      TERM=TERM*X**2/(I*(I-1))
      IF(TERM.LT.E)GO TO 30
      SUM=SUM+(-1**(I/2))*TERM
   20 CONTINUE
   30 SIN=SUM
      RETURN
      END

.5236       1.00D-06
3.14159     1.00D-08
-.1         1.00D-08
0.          0.00D+00


      PROGRAM CURRENT (INPUT,OUTPUT,TAPE5=INPUT,TAPE6=OUTPUT,TAPE12)
C CURRENT COMPUTING PROGRAM
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 79.
C
C INPUT VALUES FOR RESISTANCE, FREQUENCY AND INDUCTANCE
      READ(5,20) R,F,L
   20 FORMAT(3F10.4)
C PRINT VALUES OF RESISTANCE, FREQUENCY AND INDUCTANCE
      WRITE(6,30) R,F,L
   30 FORMAT(3H1R=,F14.4,4H  F=,F14.4,4H  L=,F14.4)
C INPUT STARTING AND TERMINATING VALUES OF CAPACITANCE, AND INCREMENT
      READ(5,40) SC,TC,CI
   40 FORMAT(3F10.6)
C SET CAPACITANCE TO STARTING VALUE
      C=SC
C SET VOLTAGE TO STARTING VALUE
      V=1.0
C PRINT VALUE OF VOLTAGE
   50 WRITE(6,60) V
   60 FORMAT(3H0V=,F5.0)
C COMPUTE CURRENT AI
   70 AI = E / SQRT(R**2 + (6.2832*F*L - 1.0/(6.2832*F*C))**2)
C PRINT VALUES OF CAPACITANCE AND CURRENT
      WRITE(6,80) C,AI
   80 FORMAT(3H0C=,F7.5,4H  I=,F7.5)
C INCREASE VALUE OF CAPACITANCE
      C = C + CI
      IF (C .LE. TC) GO TO 70
C INCREASE VALUE OF VOLTAGE
      V = V + 1.0
C STOP IF VOLTAGE IS GREATER THAN 5.0
      IF (V .LE. 5.0) GO TO 50
      STOP
      END

10.       .159      10.
.06       .12       .01


      PROGRAM NUMALPH (INPUT,OUTPUT,TAPE5=INPUT,TAPE6=OUTPUT,TAPE12)
C
C A PROGRAM WITH A SUBTLE INITIALIZATION ERROR
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 80.
C
C AUGMENTED TO PRODUCE SOME OUTPUT 7/11/76 REG MEESON
C
      DIMENSION NUM(80),NALPHA(80)
      DATA NBLANK /1H /
      READ (5,101) NALPHA,NUM
  101 FORMAT (80A1, T1, 80I1)
      WRITE(6,102) NALPHA, NUM
  102 FORMAT( 11H INPUT DATA / 1H0,80A1 / 1H ,80I1 )
      NUM = 0
      N = 0
      DO 30 I = 1,80
      IF (NALPHA(I) .EQ. NBLANK) GO TO 30
      N = N + 1
      NSUM = NSUM + NUM(I)
   30 CONTINUE
      WRITE(6,103) N, NSUM
  103 FORMAT( 30H0THE NUMBER OF DIGITS FOUND IS, I3 /
     $ 29H AND THE SUM OF THE DIGITS IS, I9 )
      STOP
      END


5 55  127  3967  129689 

12395  13579  2 9 6 8 10  12  19  16  16  20  5 lO  15  20  25  S 
      PROGRAM BALANCE (INPUT,OUTPUT,TAPE5=INPUT,TAPE6=OUTPUT,TAPE12)
C
C COMPUTES A TABLE OF MONTHLY BALANCES AND INTEREST CHARGES FOR
C A GIVEN PRINCIPAL AMOUNT, INTEREST RATE, AND MONTHLY PAYMENT.
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 85.
C
C CONVERTED TO FORTRAN 7/11/78 REG MEESON
C
      REAL A, R, M, B, C, P
C
   10 READ (5,101) A, R, M
  101 FORMAT(3F10.4)
      WRITE(6,102) A, R, M
  102 FORMAT(14H THE AMOUNT IS,F10.2,
     $ 23H   THE INTEREST RATE IS,F6.2,
     $ 25H   THE MONTHLY PAYMENT IS,F8.2)
      IF (M .LE. A*R/1200.) GO TO 30
      WRITE(6,103)
  103 FORMAT(1H-,
     $59H        MONTH      BALANCE       CHARGE   PAID ON PRINCIPAL / )
      B=A
      DO 18 I=1,60
      C=B*R/1200.
      IF (B+C .LT. M) GO TO 20
      P=M-C
   18 WRITE(6,181) I, B, C, P
  181 FORMAT (I13, 3F13.2)
   20 BPLUSC = B+C
      WRITE(6,201) BPLUSC
  201 FORMAT (35H0THERE WILL BE A LAST PAYMENT OF   , F8.2)
      GO TO 10
   30 WRITE(6,301)
  301 FORMAT (30H0UNNACCEPTABLE MONTHLY PAYMENT )
      GO TO 10
      END

500.      18.       85.
100.      5.        17.
1200.     15.       12.


      PROGRAM BINSRCH (INPUT,OUTPUT,TAPE12)
C BINARY SEARCH PROCEDURE TO FIND AN ELEMENT *A* IN A TABLE *X*
C THE ELEMENTS IN *X* MUST ALREADY BE SORTED INTO INCREASING ORDER
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 87.
C
      DIMENSION X(200),Y(200)
      READ 50, N
   50 FORMAT(I5)
    2 READ 51, (X(K), Y(K), K = 1, N)
   51 FORMAT (2F10.5)
      READ 52,A
   52 FORMAT (F10.5)
      IF (X(1)-A)41, 41, 11
   41 IF(A-X(N))5, 5, 11
   11 PRINT 53,A
   53 FORMAT(1H ,F10.5,
     1 26H IS NOT IN RANGE OF TABLE.)
      STOP
    5 LOW = 1
      IHIGH = N
    6 IF (IHIGH-LOW-1)7, 12, 7
   12 PRINT 54, XLOW, YLOW, A, XHIGH, YHIGH
   54 FORMAT(1H ,5F10.5)
      STOP
    7 MID = (LOW + IHIGH)/2
      IF (A-X(MID))9, 9, 10
    9 IHIGH = MID
      GO TO 6
   10 LOW = MID
      GO TO 6
      END

    7
-3.2      1.
-.1       2.
1.3       3.
8.7       4.
20.5      5.
22.8      6.
697.1     7.
1.


      PROGRAM INTEGR8 (OUTPUT,TAPE2=OUTPUT,TAPE12)
C
C INTEGRATES A POLYNOMIAL BY TRAPEZOIDAL APPROXIMATION
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 91.
C
      AREA=0.
      X = 1.
      DELTX=0.1
    9 Y=X**2+2.*X+3.
      X=X+DELTX
      YPLUS=X**2+2.*X+3.
   10 AREA=AREA+(YPLUS+Y)/2.*DELTX
      IF(X-10.)9,15,15
   15 WRITE(2,7)AREA
    7 FORMAT(E20.8)
      STOP
      END


      PROGRAM FLOATPT (INPUT,OUTPUT,TAPE1=OUTPUT,
     $ TAPE2=INPUT,TAPE3=OUTPUT,TAPE12)
C TESTS FOR EXACT EQUALITY BETWEEN COMPUTED FLOATING POINT NUMBERS
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 93.
C
C RIGHT TRIANGLES
      LOGICAL RIGHT, DATA
      DO 1 K = 1,100
      READ (2,10) A, B, C
C CHECK FOR NEGATIVE OR ZERO DATA
      DATA = A.GT.0. .AND. B.GT.0. .AND. C.GT.0.
      IF(.NOT.DATA) GO TO 2
C CHECK FOR RIGHT TRIANGLE CONDITION
      A = A**2
      B = B**2
      C = C**2
      RIGHT = A.EQ.B+C .OR. B.EQ.A+C .OR. C.EQ.A+B
    1 WRITE(3,11) K, RIGHT
      CALL EXIT
C ERROR MESSAGE
    2 WRITE(1,12)
      STOP
   10 FORMAT(3F10.4)
   11 FORMAT(I6,L12)
   12 FORMAT(11H DATA ERROR)
      END

1.        2.        3.
5.        12.       13.
3.        4.        5.
.05       .12       .13
.3        .4        .5
0.        0.        0.


      PROGRAM AREATRY (INPUT,OUTPUT,TAPE2=INPUT,TAPE3=OUTPUT,TAPE12)
C FIRST ATTEMPT FOR APPROXIMATING AREA UNDER A CURVE
C
C SOURCE- KERNIGHAN AND PLAUGER
C         THE ELEMENTS OF PROGRAMMING STYLE
C         PAGE 96.
C
    1 AREA=0.0
      READ(2,10)T
   10 FORMAT(F10.1)
      H=0.1
      X=0.0
    2 XN=-X
      AREA=AREA+(6.0*(2.0**XN)+6.0*(2.0**(XN-H)))*0.1/2.0
      X=X+H
      IF(X-T)2,8,9
    8 WRITE(3,33)AREA
   33 FORMAT(7H AREA =,F8.5)
      GO TO 1
    9 CALL EXIT
      END

3.
5.
1.

